Decay of Correlations for Sparse 
Graph Error Correcting Codes 

Shrinivas Kudekar, Nicolas Macris

Communication Theory Laboratory 
School of Computer and Communication Sciences 
Ecole Polytechnique Fédérale de Lausanne
CH-1015 Lausanne, Switzerland 
shrinivas.kudekar@epfl.ch, nicolas.macris@epfl.ch

March 10, 2009 



Abstract 

The subject of this paper is transmission over a general class of 
binary-input memoryless symmetric channels using error correcting 
codes based on sparse graphs, namely low-density generator-matrix 
and low-density parity-check codes. The optimal (or ideal) decoder 
based on the posterior measure over the code bits, and its relationship 
to the sub-optimal belief propagation decoder, are investigated. We 
consider the correlation (or covariance) between two codebits, aver- 
aged over the noise realizations, as a function of the graph distance, 
for the optimal decoder. Our main result is that this correlation de- 
cays exponentially fast for fixed general low-density generator-matrix 
codes and high enough noise parameter, and also for fixed general low- 
density parity-check codes and low enough noise parameter. This has 
many consequences. Appropriate performance curves - called GEXIT 
functions - of the belief propagation and optimal decoders match in 
high/low noise regimes. This means that in high/low noise regimes the 
performance curves of the optimal decoder can be computed by den- 
sity evolution. Another interpretation is that the replica predictions 
of spin-glass theory are exact. Our methods are rather general and
use cluster expansions first developed in the context of mathematical 
statistical mechanics. 

1 Introduction



Low-density parity-check (LDPC) codes based on sparse graphs have emerged
as a focal point in the theory of error correcting codes, used in noisy channel
communication, largely because they are amenable to low complexity decoding
and at the same time have a good performance (measured as the gap to
Shannon's capacity). An important class of low complexity decoders are the
message passing iterative decoders. In this framework, in order to decode
a bit attached to a node of the graph, one unravels a computational tree
(or covering tree) and iteratively updates messages (suitable functions of the
channel output observations) passed along the edges of the computational
tree. We refer to the recent book [1] for the state of the art of this general
theory. One would also like to be able to compare sub-optimal message passing
decoders with the optimal or ideal decoder. The latter is based on the
posterior probability distribution supported on code-bits and is optimal in
the sense that it minimizes the bit-error-rate among all decoders
(it is also called the MAP decoder, and this is the terminology that we adopt in
this paper). A priori the comparison of decoders is not easily done since the
MAP decoder is in general computationally complex.

One of the most important low complexity message passing decoders is
the belief propagation (BP) algorithm. It is well known that for a code whose
graph is a tree, the BP algorithm has the same performance as the MAP
decoder. This essentially comes from the fact that on a tree the computational
graph of a node matches the original graph itself. However, codes based on
tree graphs have poor performance and one needs to consider graphs with
loops or cycles. With cycles in the original graph, the messages on the
computational tree are no longer independent, and it is not a priori clear if
and why the BP algorithm should retain any close relationship to the MAP
decoder. A fundamental theoretical tool that allows one to analyze the BP
algorithm is density evolution (DE), first developed in [2]. From DE one can for
example obtain a noise threshold above which reliable communication is not
possible with BP decoding. The analysis proceeds by first taking a very large
block length $n$ and looking at $d \ll n$ iterations of the BP decoder. Eventually
one considers the asymptotics $\lim_{d\to+\infty}\lim_{n\to+\infty}$. However, in the practical
use of the decoder one fixes $n$ (large) and $d \gg n$ iterations are
performed until one reaches an acceptably small bit-error-rate. This corresponds
to the asymptotics $\lim_{n\to+\infty}\limsup/\liminf_{d\to+\infty}$. The practical success of
density evolution relies on the equivalence of these two limiting procedures, an
open problem in general.

It is fair to say that these issues have been resolved over the binary erasure 
channel (BEC) [1] (the analysis however has not proceeded from the point
of view of the correlations of the MAP decoder). An important tool in the 
analysis over the BEC has been the "extended BP" extrinsic information 
transfer (EXIT) curve (which is a suitable continuation of the bit-error-rate 
curve under BP decoding). Recently it was shown that the bit-error-rate 
of the MAP decoder can be obtained from that of the extended BP EXIT 
curve by a Maxwell construction just as in the theory of first order phase 
transitions^1 [6], [7]. This construction allows one to compute the MAP noise
threshold, and to compare it to the BP noise threshold. The validity of the
exchange of limits $d \to +\infty$ and $n \to +\infty$ (for the BP decoder) can also be
derived for the BEC using natural monotonicity properties of the decoder 

m 

For the case of transmission over more general channels very little is 
known about these issues. Indeed there one lacks the combinatorial methods 
available for the BEC, and radically new methods have to be used. Conve- 
nient measures of the performance, which generalize the EXIT curves, are 
the so-called GEXIT curves [1] (see the next section for their precise defini- 
tion). It is believed that in terms of these, the results obtained for the BEC 
still hold. In particular, the GEXIT curves for the BP and the MAP decoder 
should match for high and low noise regimes away from the phase transition 
thresholds. Such conjectures are supported by spin glass theory calculations 
(e.g. the replica and cavity methods) which provide conjectural but analytic
formulas. One-sided bounds have been derived for the GEXIT curves by 
the (information theoretical) method of physical degradation [1] and also by 
using correlation inequalities valid for spin glasses [13]. Related bounds on 
the conditional input-output entropy have also been derived [3], [1] by us- 
ing "interpolation methods" first developed in the mathematical theory of 
spin glasses [5], [8], [9]. As it turns out, all these bounds match the replica 
expressions and are therefore believed to be the best possible. In [10] the 
interpolation method has been extended to obtain the converse bounds for 
a class of Poissonian LDPC codes over the BEC, thus recovering combina- 
torial results of [11] in a completely different way. Concerning the problem 

^1 The extended BP curve corresponds to the pressure-volume curve of the Van der
Waals theory of the liquid-gas transition, and the MAP curve corresponds to the isotherms
obtained by Maxwell's equal area construction.






of exchanging the $d, n \to +\infty$ limits we refer to [12] for recent progress that
goes beyond the BEC. 

In this work we will show that a good deal can be learned by looking at 
the correlations (more precisely the covariance), averaged over the channel 
outputs, of the MAP decoder. We comment below about the methods used, 
but let us say at the outset that our aim is to cover a fairly general class of bi- 
nary input memoryless symmetric channels, including the binary symmetric 
and Gaussian ones. One of our main results is that for sufficiently low noise
(LDPC codes) the correlations between two code-bits decay exponentially 
fast as a function of the graph distance between the two code-bits, uniformly 
in the block length size n. The sparsity of the underlying graph then implies 
that, if furthermore the decay rate beats the local expansion of the graph, 
the MAP GEXIT curve can be computed by DE. Another interpretation of 
this result is that the solutions provided by the replica/cavity methods of 
spin glass theory are exact. 

Low-density generator-matrix (LDGM) codes have a very clear relationship
to spin glass models on random graphs and it is useful to study
them before we can attack the harder case of LDPC codes. Besides, the 
present analysis could potentially be useful in other contexts where they are 
used (e.g. rateless codes, source coding). For high noise we prove the decay 
of correlations and that the MAP GEXIT curve can be computed by DE. 
For that system, we can also show that the decay of correlations implies that 
the limits $d, n \to +\infty$ can be exchanged for the BP decoder, at least on the
binary symmetric channel. 

The study of the behavior of correlations as a function of the distance 
between local degrees of freedom is one of the central aims of statistical 
physics. For lattice spin systems (e.g. the Ising model) an important crite- 
rion that ensures correlation decay is Dobrushin's criterion [12] - which is of 
probabilistic nature - and its various improvements. The main other method 
- which is not necessarily of probabilistic nature - is based on suitable ex- 
pansions in powers of "the strength of interactions". There exists a host of 
such expansions collectively called "cluster expansions", and in the context of
spin systems the first and simplest such expansion is the so-called "polymer 
expansion" [16]. The main rule of thumb is that all these methods work if 
the degrees of freedom are weakly interacting, or if one can transform the 
original system into an effective one involving new weakly interacting degrees 
of freedom. It turns out that sophisticated forms of the cluster expansions 
can be carried out, for LDGM codes in a high noise regime, and for LDPC 
codes in a low noise regime, for a fairly general class of channels including
the binary symmetric channel (BSC) as well as the binary input additive 
white Gaussian noise channel (BIAWGNC). As we will explain later it is necessary
to use quite sophisticated cluster expansions for at least two reasons.
Concerning LDGM codes, Dobrushin's criterion and the polymer expansion
require bounded channel outputs (and thus do not cover the case of the
BIAWGNC). Concerning LDPC codes one has to transform the system to a 
dual one that involves "negative Gibbs weights" and cannot even be treated 
by probabilistic methods. 

The rest of the paper is organized as follows. In section 2 we formulate the
models and give a unified view of the main results both for LDGM and LDPC
codes. The main strategy of the proofs is also explained there. Sections 3
and 4 contain the proofs of correlation decay and its consequences for the
GEXIT curves. The problem of exchanging the limits of iteration number
and block length size is addressed in section 5. We conclude by pointing
out open problems and further connections to the recent literature. The
appendix reviews in a streamlined form the two cluster expansions that are
used in sections 3 and 4.

Summaries of the present results have been reported for the special case 
of the BIAWGNC [H], [H]. 



2 Models and Main Results 

We consider binary-input memoryless output-symmetric channels defined by
a transition p.d.f. $p_{Y|X}(y|x)$ with inputs $x \in \{-1,+1\}$ and outputs belonging
to $\mathbb{R}$. Since we use techniques from statistical mechanics it is convenient to
immediately map the usual input alphabet $\{0,1\}$ to $\{-1,+1\}$. Symmetry of
the channel means that $p_{Y|X}(-y|-x) = p_{Y|X}(y|x)$. The intensity of the
noise is called $\epsilon$. It will be convenient to trade off the channel outputs $y$ for
the half-loglikelihood

$$ l = \frac{1}{2}\ln\left[\frac{p_{Y|X}(y|+1)}{p_{Y|X}(y|-1)}\right] \tag{1} $$
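
As a small illustration of the definition of the half-loglikelihood, the sketch below evaluates $l$ for the two channels used as running examples in this paper. The function names are hypothetical, and the Gaussian channel is assumed to be parametrized by a noise variance $\epsilon^2$; this parametrization is an assumption of the sketch.

```python
import math

def half_llr_bsc(y: int, eps: float) -> float:
    """Half-loglikelihood l = (1/2) ln[p(y|+1) / p(y|-1)] for a BSC.

    y is the received symbol in {-1, +1}; eps is the flip probability (0 < eps < 1/2).
    """
    p_plus = 1.0 - eps if y == +1 else eps    # p(y | x = +1)
    p_minus = 1.0 - eps if y == -1 else eps   # p(y | x = -1)
    return 0.5 * math.log(p_plus / p_minus)

def half_llr_biawgn(y: float, eps: float) -> float:
    """Same quantity for y = x + noise, with noise variance eps**2 (assumed).

    ln p(y|+1) - ln p(y|-1) = [(y + 1)^2 - (y - 1)^2] / (2 eps^2) = 2 y / eps^2,
    so the half-loglikelihood is y / eps^2.
    """
    return y / eps**2
```

On the BSC the half-loglikelihood takes only the two values $\pm\frac{1}{2}\ln\frac{1-\epsilon}{\epsilon}$, whereas on the Gaussian channel it is unbounded, which is the distinction that matters later for the choice of cluster expansion.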

It is well known that on a symmetric channel one can assume without loss
of generality that the all-one codeword (i.e. the usual all-zero codeword) is
transmitted and therefore the channel outputs are i.i.d. with distribution
$p_{Y|X}(y|+1)dy = c(l)dl$. For clarity, we assume that the noise parameter
varies in an interval $[0, \epsilon_{\max}]$ ($\epsilon_{\max}$ is possibly infinite) where $\epsilon \to 0$
corresponds to low noise and $\epsilon \to \epsilon_{\max}$ corresponds to high noise. For example,
$\epsilon_{\max} = \frac{1}{2}$ for the BSC, $\epsilon_{\max} = +\infty$ for the BIAWGNC (and $\epsilon_{\max} = 1$ for the
BEC). The general class of channels for which our main results hold is

Class of channels. We define the class $\mathcal{K}$ of binary-input memoryless
output-symmetric channels:

1. The numbers $T_{2p}(\epsilon) = \frac{d}{d\epsilon}\int_{-\infty}^{+\infty} dl\, c(l)(\tanh l)^{2p}$ are bounded uniformly
with respect to $p \ge 1$ integer.

2. For any finite $m > 0$ we have $\mathbb{E}[e^{m|l|}] \le C_m < +\infty$.

3. (Low noise condition) There exists $s_0 > 0$ small enough such that for
$0 < s \le s_0$ we have $\lim_{\epsilon\to 0} \mathbb{E}[e^{-sl}] = 0$.

4. (High noise condition) Set $\delta(\epsilon, H) = e^{4H} - 1 + \mathbb{P}(|l| > H)$. One can
find $H(\epsilon)$ such that $\lim_{\epsilon\to\epsilon_{\max}} \delta(\epsilon, H(\epsilon)) = 0$.

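Note that this class is not the most general that we can treat but it is at
the same time fairly general and keeps the analysis at a technically reasonable
level. An important example is the BSC (we keep $0 < \epsilon < \frac{1}{2}$)

$$ p_{Y|X}(y|x) = (1-\epsilon)\delta(y-x) + \epsilon\,\delta(y+x), \qquad c(l) = (1-\epsilon)\,\delta\Big(l - \frac{1}{2}\ln\frac{1-\epsilon}{\epsilon}\Big) + \epsilon\,\delta\Big(l + \frac{1}{2}\ln\frac{1-\epsilon}{\epsilon}\Big) \tag{2} $$

One can check that the conditions are met with $T_{2p}(\epsilon) = -4p(1-2\epsilon)^{2p-1}$,
$\mathbb{E}[e^{m|l|}] = \big(\frac{1-\epsilon}{\epsilon}\big)^{m/2}$, $\mathbb{E}[e^{-sl}] = \epsilon^{s/2}(1-\epsilon)^{1-s/2} + (1-\epsilon)^{s/2}\epsilon^{1-s/2}$ and $H(\epsilon) = \frac{1}{2}\ln\frac{1-\epsilon}{\epsilon}$.
Another important example is the BIAWGNC

$$ p_{Y|X}(y|x) = \frac{1}{\sqrt{2\pi\epsilon^{2}}}\exp\Big(-\frac{(y-x)^2}{2\epsilon^2}\Big), \qquad c(l) = \frac{1}{\sqrt{2\pi\epsilon^{-2}}}\exp\Big(-\frac{(l-\epsilon^{-2})^2}{2\epsilon^{-2}}\Big) \tag{3} $$

Again one can check that the conditions are met with $T_{2p}(\epsilon) \le \int_{-\infty}^{+\infty} dl\, \big|\frac{dc(l)}{d\epsilon}\big|$,
$\mathbb{E}[e^{m|l|}] < \infty$, $\mathbb{E}[e^{-sl}] = e^{-\frac{s}{2\epsilon^2}(2-s)}$ and $H(\epsilon) = \epsilon^{-1/2}$. Note that the BEC is
not contained in the class $\mathcal{K}$ because of the second condition. Nevertheless,
due to the special nature of this channel our methods can easily be adapted,
but we will not give the details here since this is a case that has already been
thoroughly analyzed in the literature [1].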
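
The closed forms quoted for the BSC can be checked numerically, since on this channel $c(l)$ is supported on two points. The sketch below is only an illustration: the helper names are hypothetical, and the quantity $\delta(\epsilon,H)$ is taken in the form $e^{4H} - 1 + \mathbb{P}(|l| > H)$, which is an assumption about the exact constant in the exponent.

```python
import math

def bsc_H(eps: float) -> float:
    """Magnitude of the half-loglikelihood on the BSC: (1/2) ln((1-eps)/eps)."""
    return 0.5 * math.log((1.0 - eps) / eps)

def bsc_llr_distribution(eps: float):
    """Support points and weights of c(l) when the all-(+1) codeword is sent."""
    L = bsc_H(eps)
    return [(+L, 1.0 - eps), (-L, eps)]

def expect(f, eps: float) -> float:
    """E[f(l)] with respect to c(l) on the BSC."""
    return sum(p * f(l) for l, p in bsc_llr_distribution(eps))

def delta(eps: float, H: float) -> float:
    """delta(eps, H) = e^{4H} - 1 + P(|l| > H) (assumed form of condition 4)."""
    tail = sum(p for l, p in bsc_llr_distribution(eps) if abs(l) > H)
    return math.exp(4.0 * H) - 1.0 + tail
```

With the choice $H(\epsilon) = \frac{1}{2}\ln\frac{1-\epsilon}{\epsilon}$ the tail probability vanishes identically, so `delta(eps, bsc_H(eps))` reduces to $\big(\frac{1-\epsilon}{\epsilon}\big)^2 - 1$, which goes to zero as $\epsilon \to \frac{1}{2}$.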

Fixed LDGM codes are constructed from a fixed bipartite graph with $m$
information-bit nodes (variable nodes) and $n$ code-bit nodes (check nodes),
and edges connecting variable and check nodes only. The design rate of the
code $R = \frac{m}{n}$ is kept fixed. The set of neighbors of a variable node $a$ is called
$\partial a$ and the set of neighbors of a check node $i$ is called $\partial i$. We consider graphs
with bounded node degrees $|\partial a| \le l_{\max}$ and $|\partial i| \le k_{\max}$. Information bits
$u_1, \dots, u_m \in \{-1,+1\}^m$ are attached to the variable nodes and the code-bits
$x_1, \dots, x_n$ attached to the check nodes are obtained as

$$ x_i = \prod_{a\in\partial i} u_a, \qquad i = 1, \dots, n \tag{4} $$
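
The encoding rule (4) is a product over neighbors in the $\{-1,+1\}$ alphabet, which is the multiplicative form of the usual XOR of generator-matrix encoding. A minimal sketch follows; the function name and the adjacency-list representation `check_neighbors` are hypothetical choices made for this illustration.

```python
from math import prod

def ldgm_encode(u, check_neighbors):
    """Code-bits x_i = product of u_a over a in the neighborhood of check i.

    u               -- list of information bits in {-1, +1}
    check_neighbors -- check_neighbors[i] lists the variable nodes attached to check i
    """
    return [prod(u[a] for a in nbrs) for nbrs in check_neighbors]

# Example: m = 3 information bits, n = 4 code-bits.
u = [1, -1, -1]
check_neighbors = [[0, 1], [1, 2], [0, 1, 2], [2]]
x = ldgm_encode(u, check_neighbors)
```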

We also consider ensembles of such codes defined by random graph constructions.
We do not explain the details of these constructions here except for
saying that an LDGM($\Lambda$, $P$) ensemble is specified by the generating functions
of variable (resp. check) node degree distributions $\Lambda(z) = \sum_{l=1}^{l_{\max}} \Lambda_l z^l$ (resp.
$P(z) = \sum_{k=1}^{k_{\max}} P_k z^k$).

Fixed LDPC codes are similarly constructed from a fixed bipartite graph
with $n$ variable nodes $i = 1, \dots, n$ (this time these are the code-bit nodes) and
$m$ check nodes $c = 1, \dots, m$, with edges connecting variable and check nodes
only. The design rate $R = 1 - \frac{m}{n}$ is fixed. We assume that the node degrees
are bounded, $|\partial i| \le l_{\max}$ and $|\partial c| \le k_{\max}$. The code-bits attached
to the variable nodes satisfy $m$ parity check constraints

$$ \prod_{i\in\partial c} x_i = 1, \qquad c = 1, \dots, m \tag{5} $$
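
In the $\{-1,+1\}$ alphabet the parity checks (5) say that the product of the code-bits around every check equals $+1$. The sketch below tests this for a given word and enumerates a toy code by brute force; the names and the small example graph are hypothetical.

```python
from itertools import product
from math import prod

def is_codeword(x, check_neighbors):
    """True if x in {-1,+1}^n satisfies every parity check prod_{i in dc} x_i = 1."""
    return all(prod(x[i] for i in nbrs) == 1 for nbrs in check_neighbors)

# Toy LDPC code: n = 4 variable nodes, m = 2 checks.
check_neighbors = [[0, 1, 2], [1, 2, 3]]
codewords = [x for x in product((-1, 1), repeat=4) if is_codeword(x, check_neighbors)]
```

Since the two checks of this example are linearly independent, the enumeration returns $2^{4-2} = 4$ codewords, in agreement with the design rate $R = 1 - \frac{m}{n}$.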

We also consider ensembles of such codes defined by random graph constructions;
an ensemble is specified by the generating functions of variable (resp.
check) node degree distribution $\Lambda(z) = \sum_{l=1}^{l_{\max}} \Lambda_l z^l$ (resp. $P(z) = \sum_{k=1}^{k_{\max}} P_k z^k$)

m 

The optimal MAP decoder is based on the posterior measure of the transmitted
codeword given the received message $y^n = (y_1, \dots, y_n)$. For LDGM
codes this conditional measure is best viewed as being supported on
information bits,

$$ p_{U^m|Y^n}(u^m|y^n) = \frac{1}{Z}\prod_{i=1}^{n} e^{l_i \prod_{a\in\partial i} u_a} \tag{6} $$

For LDPC codes the conditional measure is

$$ p_{X^n|Y^n}(x^n|y^n) = \frac{1}{Z}\prod_{c=1}^{m} \frac{1}{2}\Big(1 + \prod_{i\in\partial c} x_i\Big) \prod_{i=1}^{n} e^{l_i x_i} \tag{7} $$
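
For very small codes the averages with respect to the LDGM posterior measure can be evaluated by direct enumeration of the $2^m$ information words. The following sketch does this; the names are hypothetical, and the measure is taken in the form $\frac{1}{Z}\prod_i e^{l_i \prod_{a\in\partial i} u_a}$ as an assumption of the illustration. It also computes the covariance between two code-bits, which is the quantity whose decay is studied in this paper. The cost is exponential in $m$, so this is only a sanity-check tool and not a decoder.

```python
import itertools
import math

def ldgm_gibbs_average(f, l, check_neighbors, m):
    """Average of f(u) under p(u|y) proportional to prod_i exp(l_i * prod_{a in di} u_a)."""
    Z = 0.0
    numerator = 0.0
    for u in itertools.product((-1, 1), repeat=m):
        energy = sum(li * math.prod(u[a] for a in nbrs)
                     for li, nbrs in zip(l, check_neighbors))
        weight = math.exp(energy)
        Z += weight
        numerator += f(u) * weight
    return numerator / Z

def code_bit_covariance(i, j, l, check_neighbors, m):
    """<x_i x_j> - <x_i><x_j> for code-bits x_i = prod_{a in di} u_a."""
    xi = lambda u: math.prod(u[a] for a in check_neighbors[i])
    xj = lambda u: math.prod(u[a] for a in check_neighbors[j])
    avg = lambda f: ldgm_gibbs_average(f, l, check_neighbors, m)
    return avg(lambda u: xi(u) * xj(u)) - avg(xi) * avg(xj)
```

For a single check attached to a single information bit the average of $u_0$ is $\tanh l_0$, and two checks that share no information bit have zero covariance because the measure factorizes.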



In both cases $Z$ is the appropriate normalization factor. These measures
are random because of the channel outputs and possibly because the code
is chosen at random from an ensemble. The average with respect to the
channel outputs is often denoted by $\mathbb{E}_{l^n}$ and the average with respect to a
code ensemble is generically denoted by $\mathbb{E}_{\mathcal{C}}$. We will also use the notation
$\mathbb{E}_{l^n\setminus i}$ when the average is over all outputs except the $i$-th one. A crucial point
is that the interactions or constraints in these measures are local so that they
can be analyzed with the tools developed in the theory of Gibbs measures
[12]. We use the bracket notation

$$ \langle f \rangle = \sum_{u^m} f(u^m)\, p_{U^m|Y^n}(u^m|y^n), \qquad \langle f \rangle = \sum_{x^n} f(x^n)\, p_{X^n|Y^n}(x^n|y^n) \tag{8} $$

for the Gibbs averages of functions $f$. It turns out that even for (6) we will
only need to look at averages of functions of the transmitted codebits $x^n$; for
example $\langle x_i \rangle = \langle \prod_{a\in\partial i} u_a \rangle$. It is important to remember that the bracket is
defined for finite $n$ although we do not write explicitly $\langle - \rangle_n$ to alleviate the
notation. The average (over noise realizations) Gibbs entropy of the two
measures is nothing else than Shannon's input-output conditional entropy
$\frac{1}{m}H(U^m|Y^n)$, $\frac{1}{n}H(X^n|Y^n)$, denoted in both cases by $h_n$. The MAP-GEXIT
function is simply defined as the $\epsilon$ derivative of this conditional entropy.
When this derivative is performed one finds that the MAP-GEXIT function
is a functional of the soft-bit MAP estimate $\langle x_i \rangle$ (or the magnetization). It
is much more convenient, in fact, to express it as a functional of the extrinsic
estimate $\langle x_i \rangle_0$ computed for $l_i = 0$,

$$ g_n(\epsilon) = \mathbb{E}_{\mathcal{C}}[\mathcal{G}(\langle x_i \rangle_0)] \tag{9} $$

The explicit form of the functionals corresponding to LDPC and LDGM
codes is given in sections 3 and 4 (see also [1], [13]).

Let us now describe the BP decoder from the point of view of Gibbs
measures. Given a graph $G$ defining a given LDGM or LDPC code
for $n$ and $m$ fixed (large), we choose a code-bit node $i$ and construct the
computational tree $T_d(i)$ of depth $d$ (even). This is the universal covering
tree truncated at distance $d$ from node $i$. We label the variable/check nodes
of this tree with new independent labels denoted $\nu$. Let $\pi : T_d(i) \to G$ be the
projection from the covering tree to the original graph. A node $\nu \in T_d(i)$ has
an image $\pi(\nu)$, and due to the loops in $G$ this projection is a many to one
map: one may have $\nu \neq \nu'$, $\pi(\nu) = \pi(\nu')$. Now, consider a tree-code defined
in the usual way on the tree-graph $T_d(i)$. One can view the BP decoder for
node $i$ as a MAP decoder for this tree-code. In other words the BP decoder
uses the Gibbs measure on $T_d(i)$: one crucial point is that for this Gibbs
measure the half-loglikelihood variables attached to the nodes are no longer
independent. For the LDGM case the measure is

$$ \frac{1}{Z_{T_d(i)}} \prod_{k\in T_d(i)} e^{l_{\pi(k)} \prod_{a\in\partial k} u_a} \tag{10} $$

while for the LDPC case

$$ \frac{1}{Z_{T_d(i)}} \prod_{c\in T_d(i)} \frac{1}{2}\Big(1 + \prod_{k\in\partial c} x_k\Big) \prod_{k\in T_d(i)} e^{l_{\pi(k)} x_k} \tag{11} $$

where in each case $Z_{T_d(i)}$ is the proper normalization factor. We call $\langle - \rangle_d^{BP}$
the Gibbs bracket with respect to these measures. The extrinsic BP soft-bit
estimate is $\langle x_i \rangle_{0,d}^{BP}$. The BP-GEXIT function can be defined^3 in terms of the
same functional as in (9),

$$ g_{n,d}^{BP}(\epsilon) = \mathbb{E}_{\mathcal{C}}[\mathcal{G}(\langle x_i \rangle_{0,d}^{BP})] \tag{12} $$

The soft-bit estimate $\langle x_i \rangle_{0,d}^{BP}$ can be computed exactly by summing the spins
starting from the leaves of $T_d(i)$ all the way up to the root $i$. This computation
is left to the reader and yields the usual message passing BP algorithm.

We are now ready to describe our main results. The main one concerns
the exponential decay of the average correlation between two code-bits $x_i$ and
$x_j$ as a function of their graph distance $\mathrm{dist}(i,j)$, uniformly in the system
size $n$.



^3 The definition adopted here is very natural from the point of view of the measures
(10) and (11). In [1] another definition is given that is more natural from the point of
view of information theory. It is not difficult to show that they are equivalent as $n \to +\infty$.
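
Theorem 1 (Decay of correlations for the MAP decoder). Consider communication
over channels $\mathcal{K}$. Take a fixed LDGM code at high enough noise
$\epsilon_g < \epsilon < \epsilon_{\max}$ or a fixed LDPC code at low enough noise $0 < \epsilon < \epsilon_p$, where
$\epsilon_g, \epsilon_p > 0$ depend only on $l_{\max}$, $k_{\max}$. Then

$$ \mathbb{E}_{l^n}\big[\,|\langle x_i x_j \rangle - \langle x_i \rangle\langle x_j \rangle|\,\big] \le c_1 e^{-\mathrm{dist}(i,j)/\xi(\epsilon)} \tag{13} $$

where $c_1$ is a finite positive numerical constant and $\xi(\epsilon)$ is a strictly positive
constant depending only on $\epsilon$, $l_{\max}$ and $k_{\max}$. In both regimes we have that
$\xi^{-1}(\epsilon)$ grows with $\epsilon \to 0$ and $\epsilon \to \epsilon_{\max}$.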

Let us say a few words on the strategy used to prove this theorem. As
explained in the introduction, for LDGM at high noise and for channels with
bounded loglikelihood variables, (13) follows from Dobrushin's criterion or
from the polymer expansion. These however do not work when the likelihood
variables are unbounded because, roughly speaking, overlapping polymers
involve moments which can spoil the convergence as $m \to +\infty$. More
physically, what happens is that even in the high noise regime there always
exist with positive probability large portions of the graph that are at low
noise (or "low temperature")^4. We use a very convenient cluster expansion
of Dreifus-Klein-Perez [20] that overcomes this problem by organizing the
expansion over self-avoiding random walks on the graph. Since the walks
are self-avoiding the moment problem does not occur and we can treat
unbounded loglikelihoods. For LDPC codes the situation is more subtle because
of the hard parity-check constraints that give an inherently low temperature
flavor to the problem. From a purely code theoretical point of view it is
known that LDPC codes are the dual of LDGM codes. This algebraic duality
can be exploited to transform the low noise communication model with
LDPC codes to a dual model which, although not a genuine high noise
communication model with LDGM codes, still retains this flavor. In fact this
dual model involves "negative Gibbs weights". For this reason the cluster
expansion of [20] does not work anymore and we resort to another one
first devised by Berretti [21]. The two cluster expansions have to be adapted
to our setting and are therefore reviewed in a somewhat streamlined form in
Appendix A.

^4 See [19] for a nice discussion of this point related to the Griffiths singularity in the
spin glass context.

Remark 1. The proof of Theorem 1 will make it clear that for LDGM codes on
channels with bounded loglikelihoods (e.g. the BSC) at high noise, the average
correlation $\mathbb{E}[\langle x_i x_k \rangle_d^{BP} - \langle x_i \rangle_d^{BP}\langle x_k \rangle_d^{BP}]$ between the root node and another one
decays exponentially fast (here $\mathbb{E}$ is over the noise). See for related work
based on Dobrushin's criterion. The likelihood variables over $T_d(i)$ are not
independent anymore so that the unbounded case is even more complicated
now and will not be discussed here.

Our first corollary says that the MAP-GEXIT function can be computed
by the DE analysis in high/low noise regimes. It also shows that the replica
expressions computed at the appropriate fixed point are exact.

Corollary 1 (Density evolution allows one to compute MAP). Consider communication
over channels $\mathcal{K}$. For ensembles LDGM($\Lambda$, $P$) with high enough
noise $\epsilon'_g < \epsilon < \epsilon_{\max}$ and LDPC($\Lambda$, $P$) with low enough noise $0 < \epsilon < \epsilon'_p$ we
have

$$ \lim_{n\to+\infty} g_n(\epsilon) = \lim_{d\to+\infty}\lim_{n\to+\infty} g_{n,d}^{BP}(\epsilon) \tag{14} $$

Here $\epsilon'_g$ and $\epsilon'_p$ depend only on $l_{\max}$, $k_{\max}$.

This result extends to the class of channels $\mathcal{K}$ those obtained previously on the
BEC [1], [6]. In the case of LDPC ensembles with a vanishing GEXIT curve
for $\epsilon < \epsilon^*$ it is known that the result can be more easily obtained by physical
degradation [7] or correlation inequalities [13], [11] for $\epsilon < \epsilon^*$. However there
are ensembles with a GEXIT curve that is non trivial all the way down to
$\epsilon = 0$ (for example the Poisson LDPC ensemble) and for which the theorem
is new. Note that it applies whether or not there is a phase transition (e.g.
a jump discontinuity in the GEXIT curve): so it applies even in situations
where the area theorem does not allow one to prove (14). The values obtained for
$\epsilon'_{p,g}$ are worse than those $\epsilon_{p,g}$ obtained in Theorem 1. This is not surprising in
view of the following remarks. It is expected (and for the BEC in some cases
it is known) that the equality (14) is true as long as the noise parameter does
not lie in a window around the phase transition threshold where this window
is determined by an extended form of the BP-GEXIT curve (an S shaped
curve). On the other hand inside the window, close to the phase transition
threshold, it is known that (14) cannot hold. A look at the proof shows that
the decay of correlations implies (14) only if this decay is fast enough
to beat the expansion of the graph: in other words if $\xi \ln(l_{\max}k_{\max}) < 1$. Our
estimates allow us to control the growth of $\xi^{-1}$ with respect to $\epsilon$ to show that
such a regime exists. Therefore in a window close to the phase transition
threshold, even if the correlations decay, $\xi \ln(l_{\max}k_{\max}) < 1$ cannot be valid.

Finally, concerning the exchange of limits $d, n \to +\infty$ for the BP algorithm
we prove

Theorem 2 (Exchange of limits). Consider communication over the BSC.
For LDGM($\Lambda$, $P$) ensembles with bounded degrees with high enough noise
$\epsilon'_g < \epsilon < \epsilon_{\max}$, depending only on $l_{\max}$, $k_{\max}$, we have

$$ \lim_{d\to+\infty}\lim_{n\to+\infty} g_{n,d}^{BP}(\epsilon) = \lim_{n\to+\infty}\limsup_{d\to+\infty} g_{n,d}^{BP}(\epsilon) = \lim_{n\to+\infty}\liminf_{d\to+\infty} g_{n,d}^{BP}(\epsilon) \tag{15} $$

The proof is a simple application of the decay of correlations. We present
it only for the BSC but it can also be extended to any convex combination
of such channels and more generally as long as $c(l)$ has a bounded support
that diminishes as the noise parameter increases. The cases of unbounded
support (such as BIAWGNC), or of LDPC codes at low noise, require more
work and will not be discussed here. The present result complements the
recent work [15] which concerns the bit-error-rate of LDPC codes for other
message passing decoders in the regime where the error rate vanishes.



3 LDGM Codes: High Noise 

In this section we prove Theorem 1 and its corollary for LDGM codes. It is
convenient to set $K = l_{\max}k_{\max}$.

Figure 1: Each set $A$ and $B$ contains three variable nodes. The light squares
denote the generator bits in the complement of $\mathcal{B}$ and the dark squares denote
the generator bits in $\mathcal{B}$. The thick path is an example of a self-avoiding path
between $A$ and $B$ which contributes to the upper bound. The dashed path
is a non-self-avoiding path and does not contribute to the bound.

Proof of Theorem 1, LDGM. First we define the self-avoiding random walks
on which the cluster expansion is based. A self-avoiding walk $w$ between two
variable (information-bit) nodes $a$, $b$ is a sequence of variable nodes (denoted
$v_1, v_2, \dots, v_{l+1}$) and checks (denoted $c_1, c_2, \dots, c_l$), $v_1, c_1, v_2, c_2, \dots, c_l, v_{l+1}$, such
that $v_1 = a$, $v_{l+1} = b$, $v_m, v_{m+1} \in \partial c_m$, and $v_m \neq v_n$, $c_m \neq c_n$ for $m \neq n$.
We also say that two variable nodes $a$, $b$ are connected if and only if there
exists a self-avoiding walk from $a$ to $b$. Thus on a self-avoiding walk we do not
repeat variable and check nodes. From any general walk between $a$ and $b$ we
can extract a self-avoiding walk $w$ between $a$ and $b$ which has all its checks
belonging to the parent walk (this is done by chopping off all the loops of the
general walk). The length $|w|$ of the walk is the number of variable nodes
in it. If $a = b$ then the self-avoiding walk from $a$ to $b$ is the trivial walk $a$.
We define the length of such walks to be zero. Let $W_{ab}$ denote the set of all
self-avoiding walks between variable nodes $a$, $b$ and $W_{AB} = \cup_{a\in A, b\in B} W_{ab}$ (see
Figure 1). Fix some number $H > 0$ (that will depend on $\epsilon$ later on). Denote
by $\mathcal{B}$ the set of all code-bit nodes $i$ (checks) such that $|l_i| > H$. We use the
following (see Appendix A for the proof)

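Lemma 1. Consider any LDGM code with bounded left and right degree.
Consider two sets of information-bit nodes $A$, $B$ with bounded support. We
have

$$ \Big| \Big\langle \prod_{a\in A} u_a \prod_{b\in B} u_b \Big\rangle - \Big\langle \prod_{a\in A} u_a \Big\rangle \Big\langle \prod_{b\in B} u_b \Big\rangle \Big| \le 2 \sum_{a\in A}\sum_{b\in B}\sum_{w\in W_{ab}} \prod_{i\in w} \rho_i \tag{16} $$

where $\rho_i = 1$ if $i \in \mathcal{B}$ and $\rho_i = e^{4|l_i|} - 1$ if $i \notin \mathcal{B}$.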
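
The sum in the lemma runs over self-avoiding walks that alternate between variable nodes and checks without repeating either. The sketch below enumerates such walks by depth-first search on a small bipartite graph; the function name and the dictionary-based graph representation are hypothetical and chosen only for this illustration.

```python
def self_avoiding_walks(a, b, var_nbrs, check_nbrs):
    """All self-avoiding walks [v_1, c_1, v_2, ..., c_l, v_{l+1}] from variable a to variable b.

    var_nbrs[v]   -- checks attached to variable node v
    check_nbrs[c] -- variable nodes attached to check c
    No variable node and no check is visited twice.
    """
    if a == b:
        return [[a]]  # the trivial walk
    walks = []

    def extend(path, used_vars, used_checks):
        v = path[-1]
        for c in var_nbrs[v]:
            if c in used_checks:
                continue
            for w in check_nbrs[c]:
                if w in used_vars:
                    continue
                new_path = path + [c, w]
                if w == b:
                    walks.append(new_path)  # reached the endpoint, stop extending
                else:
                    extend(new_path, used_vars | {w}, used_checks | {c})

    extend([a], {a}, set())
    return walks

# Three variable nodes on a cycle, each check joining two of them.
var_nbrs = {0: ["c0", "c2"], 1: ["c0", "c1"], 2: ["c1", "c2"]}
check_nbrs = {"c0": [0, 1], "c1": [1, 2], "c2": [0, 2]}
walks = self_avoiding_walks(0, 2, var_nbrs, check_nbrs)
```

On this cycle there are exactly two self-avoiding walks from node 0 to node 2: the direct one through `c2` and the one going around through node 1.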

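The crucial feature of this lemma is that the $\rho_i$ are independent random
variables because the walks are self-avoiding. Consequently, averaging over
the noise realization in (16),

$$ \mathbb{E}_{l^n}\Big[ \Big| \Big\langle \prod_{a\in A} u_a \prod_{b\in B} u_b \Big\rangle - \Big\langle \prod_{a\in A} u_a \Big\rangle \Big\langle \prod_{b\in B} u_b \Big\rangle \Big| \Big] \le 2 \sum_{a\in A}\sum_{b\in B}\sum_{w\in W_{ab}} \prod_{i\in w} \mathbb{E}[\rho_i] \tag{17} $$

Now,

$$ \mathbb{E}[\rho_i] \le \mathbb{E}[\rho_i \mid i \notin \mathcal{B}]\,\mathbb{P}(i \notin \mathcal{B}) + \mathbb{E}[\rho_i \mid i \in \mathcal{B}]\,\mathbb{P}(i \in \mathcal{B}) \le (e^{4H} - 1) + \mathbb{P}(|l| > H) = \delta(\epsilon, H) \tag{18} $$

For our class of channels we can choose $H = H(\epsilon)$ such that $K\delta(\epsilon, H(\epsilon)) < 1$.
We get

$$ \mathbb{E}_{l^n}\Big[ \Big| \Big\langle \prod_{a\in A} u_a \prod_{b\in B} u_b \Big\rangle - \Big\langle \prod_{a\in A} u_a \Big\rangle \Big\langle \prod_{b\in B} u_b \Big\rangle \Big| \Big] \le 2 \sum_{a\in A}\sum_{b\in B}\sum_{w\in W_{ab}} \big(\delta(\epsilon, H(\epsilon))\big)^{|w|} \le 2|A||B| \sum_{d \ge \mathrm{dist}(A,B)} \big(K\delta(\epsilon, H(\epsilon))\big)^{d} \tag{19} $$

The second inequality is obtained by noticing that the number of self-avoiding
random walks of length $|w|$ is certainly bounded by $K^{|w|}$. The factor $|A||B|$
accounts for the maximum possible number of initial and final vertices. The
correlation decay of the theorem is in fact a special case of this last bound
for the choice $A = \partial i$ and $B = \partial j$. □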
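
To get a feeling for the regime in which the last bound is useful, the sketch below evaluates it on the BSC. It is an illustration under stated assumptions: the function names are hypothetical, $H(\epsilon) = \frac{1}{2}\ln\frac{1-\epsilon}{\epsilon}$ is assumed (so that the tail probability in $\delta$ vanishes), and the geometric series $\sum_{d \ge D} r^d = r^D/(1-r)$ is summed in closed form for $r = K\delta < 1$.

```python
import math

def bsc_delta(eps: float) -> float:
    """delta(eps, H(eps)) on the BSC with H(eps) = (1/2) ln((1-eps)/eps) (assumed choice)."""
    return ((1.0 - eps) / eps) ** 2 - 1.0

def correlation_bound(eps: float, K: float, dist: int, size_A: int = 1, size_B: int = 1):
    """2 |A| |B| sum_{d >= dist} (K delta)^d, or None when K * delta >= 1 (series diverges)."""
    r = K * bsc_delta(eps)
    if r >= 1.0:
        return None
    return 2.0 * size_A * size_B * r ** dist / (1.0 - r)

def bsc_noise_threshold(K: float) -> float:
    """Value of eps at which K * delta(eps) = 1, i.e. ((1-eps)/eps)^2 = 1 + 1/K."""
    return 1.0 / (1.0 + math.sqrt(1.0 + 1.0 / K))
```

In this sketch the bound decays like $e^{-\mathrm{dist}\,\ln(1/(K\delta))}$, so the condition $K\delta < 1$ is what selects the high noise regime; the threshold returned by `bsc_noise_threshold` is only the one implied by this crude estimate and not a sharp value.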

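We now look at GEXIT functions of the MAP and BP decoders. For
LDGM codes the functional giving the MAP-GEXIT function in (9) is

$$ \mathcal{G}(\langle x_i \rangle_0) = \frac{\Lambda'(1)}{P'(1)} \int_{-\infty}^{+\infty} dl_i\, \frac{dc(l_i)}{d\epsilon}\, \mathbb{E}_{l^n\setminus i}\left[ \ln\left( \frac{1 + \langle x_i \rangle_0 \tanh l_i}{1 + \tanh l_i} \right) \right] \tag{20} $$

Derivations of this formula can be found in [1], [13].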

The BP-GEXIT curve is given by the same functionals with $\langle x_i \rangle_0$ replaced
by $\langle x_i \rangle_{0,d}^{BP}$. Consider $N_d(i)$, the neighborhood of node $i$ of radius $d$, an even integer
(all the vertices at graph-distance less than or equal to $d$ from $i$). As is well known,
for an ensemble LDGM($\Lambda$, $P$) with bounded degrees, given $d$, if $n$ is large
enough, the probability that $N_d(i)$ is a tree is $1 - O(\frac{\gamma^d}{n})$ (where $\gamma$ depends
only on $l_{\max}$, $k_{\max}$). Thus when $d$ is fixed and $n \to +\infty$ the computational
tree $T_d(i)$ and the neighborhood $N_d(i)$ match with high probability. This
implies that

$$ \lim_{n\to+\infty} g_{n,d}^{BP}(\epsilon) = \lim_{n\to+\infty} \mathbb{E}_{\mathcal{C}}[\mathcal{G}(\langle x_i \rangle_{0,N_d(i)}) \mid N_d(i) \text{ is a tree}] \tag{21} $$

where $\langle x_i \rangle_{0,N_d(i)}$ is the Gibbs bracket associated to the subgraph $N_d(i)$. The
right hand side can be exactly computed by performing the statistical mechanical
sums on a tree and yields the DE formulas

$$ \lim_{d\to+\infty}\lim_{n\to+\infty} g_{n,d}^{BP}(\epsilon) = \lim_{d\to+\infty} \frac{\Lambda'(1)}{P'(1)} \int_{-\infty}^{+\infty} dl\, \frac{dc(l)}{d\epsilon}\, \mathbb{E}_{\Delta^{(d)}}\left[ \ln\left( \frac{1 + \tanh\Delta^{(d)} \tanh l}{1 + \tanh l} \right) \right] \tag{22} $$
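where both limits exist and

$$ \tanh\Delta^{(d)} = \prod_{i=1}^{k} \tanh v_i^{(d)} \tag{23} $$

The $v_i^{(d)}$ are i.i.d. random variables with distribution obtained from the
iterative system of DE equations

$$ \eta^{(d)}(v) = \sum_{l} \frac{l\Lambda_l}{\Lambda'(1)} \int \prod_{i=1}^{l-1} du_i\, \hat\eta^{(d)}(u_i)\, \delta\Big(v - \sum_{i=1}^{l-1} u_i\Big) \tag{24} $$

$$ \hat\eta^{(d)}(u) = \sum_{k} \frac{kP_k}{P'(1)} \int dl\, c(l) \prod_{a=1}^{k-1} dv_a\, \eta^{(d-1)}(v_a)\, \delta\Big(u - \tanh^{-1}\Big(\tanh l \prod_{i=1}^{k-1} \tanh v_i\Big)\Big) \tag{25} $$

with the initial condition $\eta^{(0)}(v) = \delta(v)$. It is well known that these equations
are an iterative version of the replica fixed point equation.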
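
Recursions of this type are commonly iterated numerically with a sampled population of messages. The sketch below is such a population-dynamics approximation on the BSC and is not the method used in this paper. It assumes the standard LDGM form of the recursion: a check-side message is $\tanh^{-1}(\tanh l \prod_{i=1}^{k-1}\tanh v_i)$ and a variable-side message is a sum of $l-1$ check-side messages, with degrees drawn from edge-perspective distributions. All names and parameters are hypothetical.

```python
import math
import random

def sample_degree(dist, rng):
    """Draw a degree from {degree: probability} (edge perspective, assumed normalized)."""
    r, acc = rng.random(), 0.0
    for deg, p in dist.items():
        acc += p
        if r < acc:
            return deg
    return deg  # guard against rounding in the cumulative sum

def density_evolution_bsc(eps, lam, rho, iterations, pop=2000, seed=0):
    """Population-dynamics sketch of the DE recursion for an LDGM ensemble on the BSC.

    lam, rho -- edge-perspective variable and check degree distributions
    Returns a sample of variable-to-check messages v after the given number of iterations.
    """
    rng = random.Random(seed)
    L = 0.5 * math.log((1.0 - eps) / eps)
    v = [0.0] * pop  # initial condition eta^(0)(v) = delta(v)
    for _ in range(iterations):
        # Check-side update: u = atanh( tanh(l) * prod_{i=1}^{k-1} tanh(v_i) ).
        u = []
        for _ in range(pop):
            k = sample_degree(rho, rng)
            l = L if rng.random() < 1.0 - eps else -L
            t = math.tanh(l)
            for _ in range(k - 1):
                t *= math.tanh(rng.choice(v))
            u.append(math.atanh(t))
        # Variable-side update: v = sum of (degree - 1) incoming u messages.
        v = []
        for _ in range(pop):
            deg = sample_degree(lam, rng)
            v.append(sum(rng.choice(u) for _ in range(deg - 1)))
    return v
```

With the initial condition concentrated at zero, every check of degree at least two outputs a zero message, so in this sketch the population stays at zero unless the ensemble contains checks of degree one.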



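Proof of Corollary 1, LDGM. Expanding the logarithm in (20) and using
Nishimori identities as in [13] we obtain the expansion

$$ g_n(\epsilon) = \frac{\Lambda'(1)}{P'(1)} \sum_{p=1}^{+\infty} \frac{T_{2p}(\epsilon)}{2p(2p-1)} \Big( \mathbb{E}_{\mathcal{C}, l^n\setminus i}\big[\langle x_i \rangle_0^{2p}\big] - 1 \Big) \tag{26} $$

where we recall that

$$ T_{2p}(\epsilon) = \frac{d}{d\epsilon} \int_{-\infty}^{+\infty} dl\, c(l)(\tanh l)^{2p} \tag{27} $$

Note that in order to get the above expansion, it is important to use (20)
as expressed here in terms of the extrinsic estimate. Obviously, the series is
absolutely convergent, uniformly with respect to $n$, for the class of channels
$\mathcal{K}$. Thus by dominated convergence, the proof will be complete if we show
that

$$ \lim_{n\to+\infty} \mathbb{E}_{\mathcal{C}, l^n\setminus i}\big[\langle x_i \rangle_0^{2p}\big] = \lim_{d\to+\infty} \mathbb{E}_{\Delta^{(d)}}\big[(\tanh\Delta^{(d)})^{2p}\big] \tag{28} $$

Indeed one can then compute the $n \to +\infty$ limit term by term in (26)
and then resum the resulting series (which is again absolutely convergent,
uniformly with respect to $d$) to obtain (22).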
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Let us show (1251) . As pointed out before for d fixed and n +00, Nd{i) 
is a tree with high probabihty. Thus, 

hm in\i[{xi)ff] = hm ir,\i[{xi)l^\Nd{i) is a tree] (29) 

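Notice that all paths connecting the bit $i$ with those outside $N_d(i)$ have a
length at least equal to $d$, so because of Theorem 1 in the high noise regime
$x_i$ is very weakly correlated to the complement of $N_d(i)$. Therefore we may
expect that

$$\lim_{d\to+\infty}\lim_{n\to+\infty}\mathbb{E}_{\mathcal{C},l^{n\setminus i}}\bigl[\,\bigl|\langle x_i\rangle_0^{2p} - \langle x_i\rangle_{0,N_d(i)}^{2p}\bigr|\;\big|\;N_d(i)\text{ tree}\bigr] = 0 \qquad (30)$$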

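Assuming for a moment that this is true we get from (29),

$$\lim_{n\to+\infty}\mathbb{E}_{\mathcal{C},l^{n\setminus i}}\bigl[\langle x_i\rangle_0^{2p}\bigr] = \lim_{d\to+\infty}\lim_{n\to+\infty}\mathbb{E}_{\mathcal{C},l^{n\setminus i}}\bigl[\langle x_i\rangle_{0,N_d(i)}^{2p}\,\big|\,N_d(i)\text{ is a tree}\bigr] \qquad (31)$$

and, when $N_d(i)$ is a tree, the Gibbs average $\langle x_i\rangle_{0,N_d(i)}$ is explicitly
computable and the right hand side of (31) reduces to

$$\lim_{d\to+\infty}\mathbb{E}_{\Delta^{(d)}}\bigl[(\tanh\Delta^{(d)})^{2p}\bigr] \qquad (32)$$

This proves (28).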

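Our task is now to prove (30). Let $\partial N_d(i)$ be the set of checks that are
at distance $d$ from $i$. We order the checks in $\partial N_d(i)$ in a given (arbitrary)
way, and call $\langle - \rangle_{0;\le k}$ the Gibbs average with $l_k = 0$ for the $k$ first checks
of $\partial N_d(i)$ (and $l_i = 0$ for the root node). For the first one (call it 1) we use
$e^{l_1x_1} = \cosh l_1 + x_1\sinh l_1$ to write

$$\langle x_i\rangle_0 = \langle x_i\rangle_{0;\le 1} + \frac{\tanh l_1\,\bigl(\langle x_ix_1\rangle_{0;\le 1} - \langle x_i\rangle_{0;\le 1}\langle x_1\rangle_{0;\le 1}\bigr)}{1+\langle x_1\rangle_{0;\le 1}\tanh l_1}$$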

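Therefore

$$\bigl|\langle x_i\rangle_0^{2p} - \langle x_i\rangle_{0;\le 1}^{2p}\bigr| \le 2p\,\bigl|\langle x_i\rangle_0 - \langle x_i\rangle_{0;\le 1}\bigr| \le 2p\,t_1\,\bigl|\langle x_ix_1\rangle_{0;\le 1} - \langle x_i\rangle_{0;\le 1}\langle x_1\rangle_{0;\le 1}\bigr| \qquad (34)$$

where

$$t_k = \frac{|\tanh l_k|}{1-|\tanh l_k|} \qquad (35)$$

We can now take the second check of $\partial N_d(i)$ (call it 2) and show

$$\bigl|\langle x_i\rangle_{0;\le 1}^{2p} - \langle x_i\rangle_{0;\le 2}^{2p}\bigr| \le 2p\,t_2\,\bigl|\langle x_ix_2\rangle_{0;\le 2} - \langle x_i\rangle_{0;\le 2}\langle x_2\rangle_{0;\le 2}\bigr| \qquad (36)$$

We can repeat this argument for all nodes of $\partial N_d(i)$ and use the triangle
inequality to obtain

$$\bigl|\langle x_i\rangle_0^{2p} - \langle x_i\rangle_{0,N_d(i)}^{2p}\bigr| \le 2p\sum_{k\in\partial N_d(i)} t_k\,\bigl|\langle x_ix_k\rangle_{0;\le k} - \langle x_i\rangle_{0;\le k}\langle x_k\rangle_{0;\le k}\bigr| \qquad (37)$$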
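The single-check step behind (34) to (37) can be checked numerically by brute force. The sketch below is only an illustration under stated assumptions: the toy LDGM graph `NBRS`, the half-loglikelihood values and the function names are hypothetical and not taken from the paper. It compares $\langle x_i\rangle_0$ with the expression obtained after setting one observation $l_k$ to zero, and checks the resulting bound with the coefficient $t_k = |\tanh l_k|/(1-|\tanh l_k|)$.

```python
import itertools
import math

# Hypothetical toy LDGM code: code bit k is the product of the information
# bits listed in NBRS[k]; code bit 0 plays the role of the root i.
NBRS = [(0, 1), (1, 2), (0, 2, 3), (3,), (1, 3)]
NUM_INFO_BITS = 4


def gibbs_averages(l, observables):
    """Brute-force Gibbs averages for the weight prod_k exp(l_k x_k)."""
    z = 0.0
    sums = [0.0] * len(observables)
    for u in itertools.product((1, -1), repeat=NUM_INFO_BITS):
        x = [math.prod(u[a] for a in nbrs) for nbrs in NBRS]
        w = math.exp(sum(lk * xk for lk, xk in zip(l, x)))
        z += w
        for idx, obs in enumerate(observables):
            sums[idx] += w * obs(x)
    return [s / z for s in sums]


def check_removal_identity(l, i=0, k=1):
    """Compare <x_i> computed with l_k against the formula that uses l_k = 0."""
    l_removed = list(l)
    l_removed[k] = 0.0
    (full,) = gibbs_averages(l, [lambda x: x[i]])
    xi, xk, xixk = gibbs_averages(
        l_removed, [lambda x: x[i], lambda x: x[k], lambda x: x[i] * x[k]]
    )
    t = math.tanh(l[k])
    rhs = xi + t * (xixk - xi * xk) / (1.0 + xk * t)
    t_k = abs(t) / (1.0 - abs(t))
    # Returns: exact average, formula, |difference|, and the bound t_k * |covariance|.
    return full, rhs, abs(full - xi), t_k * abs(xixk - xi * xk)


# The root has l_0 = 0 (extrinsic estimate); the other values are arbitrary.
full, rhs, diff, bound = check_removal_identity([0.0, 0.7, -0.4, 1.1, 0.3])
```

Iterating the same step over all checks at distance $d$ and summing the bounds is what produces the sum over $k$ in (37).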

Indeed the Gibbs average with $l_k = 0$ for all $k \in \partial N_d(i)$ is equal to $\langle x_i\rangle_{0,N_d(i)}$.
Now using the bound (16) in the proof of theorem 1 for $K\delta(\epsilon) < 1$, the last
inequality implies

Ec,Pv[l(^.)o' - (^.)?;v,«l I ^^(^) tree] < (38) 
Note that for channels in $\mathcal{K}$, for non-zero noise,

$$\mathbb{E}[t] = \mathbb{E}\Bigl[\frac{|\tanh l|}{1-|\tanh l|}\Bigr] \le \mathbb{E}\bigl[e^{2|l|}\bigr] < \infty \qquad (39)$$

The right hand side of (38) does not depend on $n$, so it is immediate that
$\lim_{d\to+\infty}\lim_{n\to+\infty}$ vanishes as long as the noise is high enough such that
$K\delta(\epsilon) < 1$. This proves (30) and the corollary. □

To conclude, let us remark that, for the BIAWGNC the GEXIT formulas 
simplify considerably and there is a clear relationship to the magnetization, 

= ^J^i^ - Ep[tanh(/ + tanh-^(x,)o)]) (40) 

and 

hm hm g^^2e) = ^^(1 " E,^,.) [tanh(/ + A^-^))]) (41) 

The proof of corollary 1 for the BIAWGNC can thus proceed without expansions
and is slightly simpler. The main ideas can be found in [17] and we do not
repeat them here. Note also that for the BEC similar simplifications
occur: this allows a proof which avoids the second condition
in the class of channels $\mathcal{K}$.

4 LDPC Codes: Low Noise



In this section we prove theorem 1 and corollary 1 for LDPC codes in a
low noise regime. As explained in section 2 we first transform the problem
to a dual one. The duality transformation reviewed here is essentially an
application of Poisson's summation formula over commutative groups, and
has been thoroughly discussed in the context of codes on graphs in [21]. Here
we need to know how the correlations transform under the duality, a point
that does not seem to appear in the related literature.

4.1 Duality formulas for the correlations 

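Let $C$ be a binary parity check code and $C^\perp$ its dual. We apply the Poisson
summation formula

$$\sum_{x^n\in C} f(x^n) = \frac{1}{|C^\perp|}\sum_{\tau^n\in C^\perp}\hat f(\tau^n) \qquad (42)$$

where the Fourier (or Hadamard) transform is,

$$\hat f(\tau^n) = \sum_{x^n\in\{-1,+1\}^n} f(x^n)\, e^{\frac{i\pi}{4}\sum_{j=1}^{n}(1-\tau_j)(1-x_j)} \qquad (43)$$

to the partition function $Z$ of an LDPC code $C$. The dual code $C^\perp$ is an
LDGM code with codewords given by $\tau^n$ where

$$\tau_i = \prod_{a\in\partial i} u_a \qquad (44)$$

and $u_a$ are the $m$ information bits. A straightforward application of the
Poisson formula then yields the extended form of the MacWilliams identity,

$$Z = \frac{1}{|C^\perp|}\, e^{\sum_{j=1}^{n} l_j}\, Z_\perp \qquad (45)$$

where

$$Z_\perp = \sum_{u^m\in\{-1,+1\}^m}\ \prod_{i=1}^{n}\Bigl(1 + e^{-2l_i}\prod_{a\in\partial i}u_a\Bigr) \qquad (46)$$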

This expression formally looks like the partition function of an LDGM code
with "channel half-loglikelihoods" $g_i$ such that $\tanh g_i = e^{-2l_i}$. This is truly
the case only for the BEC($\epsilon$) where $l_i = 0, +\infty$ and hence $g_i = +\infty, 0$, which
still corresponds to a BEC($1-\epsilon$). The logarithm of partition functions is
related to the input-output entropy and one recovers (taking the $\epsilon$ derivative)
the well known duality relation between EXIT functions of a code and its
dual on the BEC [25]. For other channels however this is at best a formal
(but still useful) analogy since the weights are negative for $l_i < 0$ (and $g_i$
takes complex values). We introduce a bracket $\langle - \rangle_\perp$ which is not a true
probabilistic expectation (but it is still linear)

$$\langle f\rangle_\perp = \frac{1}{Z_\perp}\sum_{u^m\in\{-1,+1\}^m} f(u^m)\prod_{i=1}^{n}\Bigl(1 + e^{-2l_i}\prod_{a\in\partial i}u_a\Bigr) \qquad (47)$$

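The denominator may vanish, but it can be shown that when this happens
the numerator also does so in a way that ensures the finiteness of the ratio
(this becomes clear in subsequent calculations). Taking the logarithm of (45)
and then the derivative with respect to $l_i$ we find

$$\langle x_i\rangle = \frac{1}{\tanh 2l_i} - \frac{\langle\tau_i\rangle_\perp}{\sinh 2l_i} \qquad (48)$$

and differentiating once more with respect to $l_j$, $j \neq i$,

$$\langle x_ix_j\rangle - \langle x_i\rangle\langle x_j\rangle = \frac{\langle\tau_i\tau_j\rangle_\perp - \langle\tau_i\rangle_\perp\langle\tau_j\rangle_\perp}{\sinh 2l_i\,\sinh 2l_j} \qquad (49)$$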
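The two duality relations (48) and (49) can be sanity-checked by exhaustive enumeration on a very small code. The following is a minimal sketch, not the paper's method: the parity-check matrix `H`, the random half-loglikelihoods and all function names are assumptions made for illustration. The primal averages are taken over codewords with weights $\prod_i e^{l_ix_i}$, the dual bracket over all $u^m$ with the signed weights of (47).

```python
import itertools
import math
import random

# Hypothetical toy parity-check matrix: rows are checks a, columns are bits i.
H = [
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [1, 0, 1, 1, 1],
]
M, N = len(H), len(H[0])


def moments(configs, weight, spins):
    """Means and covariances of spins(cfg) under normalized, possibly signed, weights."""
    z = 0.0
    mean = [0.0] * N
    second = [[0.0] * N for _ in range(N)]
    for cfg in configs:
        w, s = weight(cfg), spins(cfg)
        z += w
        for i in range(N):
            mean[i] += w * s[i]
            for j in range(N):
                second[i][j] += w * s[i] * s[j]
    mean = [v / z for v in mean]
    cov = [[second[i][j] / z - mean[i] * mean[j] for j in range(N)] for i in range(N)]
    return mean, cov


def tau(u):
    """Dual code bits: tau_i is the product of u_a over the checks a containing bit i."""
    return [math.prod(u[a] for a in range(M) if H[a][i]) for i in range(N)]


def duality_error(l):
    """Largest deviation from relations (48) and (49) for half-loglikelihoods l."""
    codewords = [
        x for x in itertools.product((1, -1), repeat=N)
        if all(math.prod(x[i] for i in range(N) if row[i]) == 1 for row in H)
    ]
    x_mean, x_cov = moments(
        codewords, lambda x: math.exp(sum(li * xi for li, xi in zip(l, x))), lambda x: x
    )
    t_mean, t_cov = moments(
        list(itertools.product((1, -1), repeat=M)),
        lambda u: math.prod(1.0 + math.exp(-2.0 * li) * ti for li, ti in zip(l, tau(u))),
        tau,
    )
    err = 0.0
    for i in range(N):
        si = math.sinh(2.0 * l[i])
        err = max(err, abs(x_mean[i] - (1.0 / math.tanh(2.0 * l[i]) - t_mean[i] / si)))
        for j in range(N):
            if j != i:
                sj = math.sinh(2.0 * l[j])
                err = max(err, abs(x_cov[i][j] - t_cov[i][j] / (si * sj)))
    return err


# Random signs make some dual weights negative, as happens for general channels.
rng = random.Random(0)
l = [rng.choice((-1, 1)) * rng.uniform(0.3, 1.0) for _ in range(N)]
max_error = duality_error(l)
```

The values of `l` are kept away from zero only to avoid the removable poles of $1/\sinh 2l_i$ in floating point arithmetic.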

We stress that in (48), (49), $\tau_i$ and $\tau_j$ are given by products of information
bits (44). The left hand side of (48) is obviously bounded. It is less obvious
to see this directly on the right hand side and here we just note that the
pole at $l_i = 0$ is harmless since, for $l_i = 0$, the bracket has all its "weight"
on configurations with $\tau_i = 1$. Similar remarks apply to (49). In any case,
we will beat the poles by using the following trick. For any $0 < s < 1$ and
$|a| \le 1$ we have $|a| \le |a|^s$, thus

$$\mathbb{E}_{l^n}\bigl[|\langle x_ix_j\rangle - \langle x_i\rangle\langle x_j\rangle|\bigr] \le 2^{1-s}\,\mathbb{E}_{l^n}\bigl[|\langle x_ix_j\rangle - \langle x_i\rangle\langle x_j\rangle|^{s}\bigr]$$

and using (49) and Cauchy-Schwarz

$$\mathbb{E}_{l^n}\bigl[|\langle x_ix_j\rangle - \langle x_i\rangle\langle x_j\rangle|\bigr] \le 2^{1-s}\,\mathbb{E}\bigl[(\sinh 2l)^{-2s}\bigr]\,\mathbb{E}_{l^n}\bigl[|\langle\tau_i\tau_j\rangle_\perp - \langle\tau_i\rangle_\perp\langle\tau_j\rangle_\perp|^{2s}\bigr]^{1/2} \qquad (50)$$

The prefactor is always finite for $0 < s < \frac{1}{2}$ for our class of channels $\mathcal{K}$. For
example for the BIAWGNC we have

E[(sinh 2ir'^] < ^^e-^"^ (51) 
for purely numerical constants $c, c' > 0$ and for the BSC we have

E[(sinh2/)--]<fMi^y^ (52) 



4.2 Decay of correlations for low noise 

We will prove the decay of correlations by applying a high temperature cluster
expansion technique to $\mathbb{E}_{l^n}[|\langle\tau_i\tau_j\rangle_\perp - \langle\tau_i\rangle_\perp\langle\tau_j\rangle_\perp|^{2s}]$. As explained in section
2 we need a technique that does not use the positivity of the Gibbs weights.
In appendix B we give a streamlined derivation of an adaptation of Berretti's
expansion.

X 

where 

K.AX)^ J2 E ^i''-rh{r^-rf)llE>. (54) 

(1) (2) r compatible k^F 

and

$$E_k = \tau_k^{(1)}e^{-2l_k} + \tau_k^{(2)}e^{-2l_k} + \tau_k^{(1)}\tau_k^{(2)}e^{-4l_k} \qquad (55)$$

Here $u^{(1)}$ and $u^{(2)}$ are two independent copies of the information bits (these
are also known as real replicas) and $\tau_k^{(\alpha)} = \prod_{a\in\partial k}u_a^{(\alpha)}$. To explain what $X$
and $\Gamma$ are we will refer to a-nodes (check nodes in the Tanner graph representing
the LDPC code) and i-nodes (variable nodes in the Tanner graph representing
the LDPC code). Given a subset $S$ of nodes of the graph let $\partial S$ be the
subset of neighboring nodes. In (53) the sum over $X$ is carried over clusters
of a-nodes such that "$X$ is connected via hyperedges": this means that a)
$X = \partial\tilde X$ for some connected subset $\tilde X$ of i-nodes; b) $\tilde X$ is connected if any
pair of its i-nodes can be joined by a path all of whose variable nodes lie in $\tilde X$;
c) $X$ contains both $\partial i$ and $\partial j$. In the sum (54) $\Gamma$ is a set of i-nodes (all
distinct). We say that "$\Gamma$ is compatible with $X$" if: (i) $\partial\Gamma\cup\partial i\cup\partial j = X$,
(ii) $\partial\Gamma\cap\partial i \neq \emptyset$ and $\partial\Gamma\cap\partial j \neq \emptyset$, (iii) there is a walk connecting $\partial i$ and $\partial j$
such that all its variable nodes are in $\Gamma$. Finally,

"a all i s.t. aGi 

Figure 2 gives an example for all the sets appearing above.
We are now ready to prove the theorem on decay of correlations.

Proof of theorem 1, LDPC. Because of (50) it suffices to prove that
$\mathbb{E}_{l^n}[|\langle\tau_i\tau_j\rangle_\perp - \langle\tau_i\rangle_\perp\langle\tau_j\rangle_\perp|^{2s}]$ decays.
The first step is to prove

Z^{X' 



all i s.t 



with 



< 1 (57) 

This ratio is not easily estimated directly because the weights in $Z_\perp$ are not
positive. However we can use the duality transformation (45) backwards to
get a new ratio of partition functions with positive weights,

z(x')^ E n 11^(1+ n (59) 

alHs.t aGX'^ iga and 

dinx=<f> dinx=cf) dinx=(j) 
This is the partition function corresponding to the subgraph induced by the
a-nodes of $X^c$ and the i-nodes such that $\partial i\cap X = \emptyset$. Moreover $C^\perp(X^c)$ is the
dual of the latter code $C(X^c)$ defined on the subgraph. By standard properties
of the rank of a matrix, the rank of the parity check matrix of $C(X^c)$, which is
obtained by removing rows (checks) and columns (variables) from the parity
check matrix of $C$, is at most the rank of the parity check matrix of $C$.
This implies $|C^\perp(X^c)| \le |C^\perp|$. Moreover

$$\exp\Bigl(\sum_{i\ \text{s.t.}\ \partial i\cap X\neq\emptyset} l_i\Bigr)\, Z(X^c) \le Z \qquad (60)$$



Figure 2: In this figure we explain the various sets appearing in the cluster
expansion (53). The Tanner graph represents the LDPC code with variable
nodes (i-nodes) denoted by circles and check nodes (a-nodes) denoted by
squares. In this example the set $X$ is the set of dark check nodes. It is
easy to verify that this choice of $X$ satisfies all our conditions. Firstly, let
$\tilde X = \{i, j, v_1, v_2, \dots, v_{12}\}$ be a set of variable nodes
(these are denoted by dark circles in the figure). It is easy to check that
the set of neighbours of $\tilde X$ is given by the dark check nodes, which is $X$.
Hence $X = \partial\tilde X$. Secondly, any two variable nodes in $\tilde X$ are connected by a
path all of whose variable nodes lie in $\tilde X$, and thirdly, $X$ contains both $\partial i$
and $\partial j$. One choice is $\Gamma = \{v_2, v_3, v_4, v_5, v_6, v_7, v_8, v_{10}, v_{11}, v_{12}\}$. It is easy to
check that $\Gamma$ is compatible with $X$. The walk $\{a_1v_{12}a_2v_2a_3v_3a_4v_6a_5v_8a_6v_{11}a_7\}$
connects $\partial i$ and $\partial j$ and all its variable nodes lie in $\Gamma$. Another choice for $\Gamma$
would be the set $\{v_2, v_3, \dots, v_{12}\}$. In the definition of
$Z_\perp(X^c)$, $Z(X^c)$ the light variable nodes $v_{13}, v_{14}, \dots, v_{21}$
are not present because their neighborhoods have a non-empty intersection with $X$.

To see this one must recognize that the left hand side is the sum of terms of
$Z$ corresponding to $x^n$ such that $x_i = +1$ for $\partial i\cap X \neq \emptyset$ (and all terms are
$\ge 0$). These remarks imply (57).

Using $|\sum_i a_i|^{2s} \le \sum_i |a_i|^{2s}$ for $0 < 2s < 1$ and (57) we find

Ep[|(T:.r,)^ - (r.)x(r,)J^^] < ^ ^^'"0^^^ (61) 

X 

Trivially bounding the spins in (55) we deduce

r compatible 
withX 

<4ix| ^ 2(^^+i)iriA(e)iri (62) 

r compatible 
witliX 

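where

$$A(\epsilon) = 2^{2s}\,\mathbb{E}\bigl[e^{-4sl}\bigr] + \mathbb{E}\bigl[e^{-8sl}\bigr] \qquad (63)$$

Since $\Gamma$ is compatible with $X$ we necessarily have $|\partial\Gamma| \ge |X| - |\partial i| - |\partial j|$ and
since $|\partial\Gamma| \le |\Gamma|\,l_{\max}$, we get $|\Gamma| \ge (|X| - 2l_{\max})/l_{\max}$. Also, the maximum
number of i-nodes which have an intersection with $X$ is $|X|\,k_{\max}$. Thus there
are at most $2^{|X|k_{\max}}$ possible choices for $\Gamma$. These remarks imply

$$\mathbb{E}_{l^n}\bigl[|K_{i,j}(X)|^{2s}\bigr] \le 2^{(2+k_{\max})|X|}\,A(\epsilon)^{(|X|-2l_{\max})/l_{\max}} \qquad (64)$$

From (61) and (64) we get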

X 

The clusters $X$ connect $\partial i$ and $\partial j$ and thus have sizes $|X| \ge \frac{1}{2}\mathrm{dist}(i,j)$.
Moreover the number of clusters of a given size grows at most like $K^{|X|}$
where $K = l_{\max}k_{\max}$. Since for the class $\mathcal{K}$ we have for $s$ small enough,
E[e'^] ^ as e -> we can always chose e small enough to make A(e) small 
enough and conclude the proof. □ 
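To make the role of $A(\epsilon)$ in (63) concrete, the following sketch checks the pointwise bound $|E_k|^{2s} \le 2^{2s}e^{-4sl} + e^{-8sl}$ that follows from (55), and evaluates $A(\epsilon)$ for a BSC. It is an illustration under assumptions not stated in this section: the BSC half-loglikelihood is taken to be $\pm\frac{1}{2}\ln\frac{1-\epsilon}{\epsilon}$ with probabilities $1-\epsilon$ and $\epsilon$ (all-$+1$ codeword transmitted), and the value $s = 0.1$ and the function names are arbitrary choices.

```python
import itertools
import math


def e_k(tau1, tau2, l):
    """Replica weight E_k of (55) for one variable node."""
    return tau1 * math.exp(-2 * l) + tau2 * math.exp(-2 * l) + tau1 * tau2 * math.exp(-4 * l)


def pointwise_bound_holds(s, l_values):
    """Check |E_k|^(2s) <= 2^(2s) e^(-4sl) + e^(-8sl) for all four spin values."""
    for l, tau1, tau2 in itertools.product(l_values, (1, -1), (1, -1)):
        lhs = abs(e_k(tau1, tau2, l)) ** (2 * s)
        rhs = 2 ** (2 * s) * math.exp(-4 * s * l) + math.exp(-8 * s * l)
        if lhs > rhs + 1e-12:
            return False
    return True


def a_bsc(eps, s):
    """A(eps) of (63) for a BSC, assuming l = +-(1/2) ln((1 - eps) / eps)."""
    l = 0.5 * math.log((1 - eps) / eps)

    def mean(f):
        # Expectation over the two-point distribution of l.
        return (1 - eps) * f(l) + eps * f(-l)

    return 2 ** (2 * s) * mean(lambda x: math.exp(-4 * s * x)) + mean(
        lambda x: math.exp(-8 * s * x)
    )


s = 0.1  # needs 4s < 1 so that both expectations vanish as eps -> 0
bound_ok = pointwise_bound_holds(s, [x / 4 for x in range(-12, 13)])
a_values = [a_bsc(eps, s) for eps in (1e-1, 1e-3, 1e-6)]
```

Under these assumptions `a_values` decreases as the noise decreases, which is the property used to make the geometric sum over clusters converge.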






4.3 Density evolution equals MAP for low noise 

In the case of LDPC codes the functional giving the MAP-GEXIT function 
in dl]) is [H] 



de [ 1 + tanh U 

Note that the only formal difference with the LDGM case is in the normal- 
ization factor; but of course now the Gibbs average pertains to the LDPC 
measure. The BP-GEXIT curve is given by the same functionals with (xj)o 
replaced by the average on the computational tree {xi)^^. As in section [3] 
we introduce $N_d(i)$, the neighborhood of node $i$ of radius $d$, with $d$ an even integer. By
the same arguments as in section 3 we have again

lim lim ^(f ^(e) = Ec[^((xi)o,7v^{i))|A^d(i) is a tree] (67) 

where $\langle x_i\rangle_{0,N_d(i)}$ is the Gibbs bracket associated to the graph $N_d(i)$. It is
important to note that for $N_d(i)$ a tree the leaves $\partial N_d(i)$ are variable
nodes and have "natural boundary conditions" as given by the channel
outputs. The statistical mechanical sums on a tree yield the DE formula

$$\lim_{d\to+\infty}\lim_{n\to+\infty} g^{BP}_{n,d}(\epsilon) = \lim_{d\to+\infty}\int_{-\infty}^{+\infty} dl\,\frac{dc(l)}{d\epsilon}\,\mathbb{E}_{\Delta^{(d)}}\Bigl[\ln\Bigl\{\frac{1+\tanh\Delta^{(d)}\tanh l}{1+\tanh l}\Bigr\}\Bigr] \qquad (68)$$

where both limits exist and

$$\Delta^{(d)} = \sum_{a=1}^{l} u_a^{(d)} \qquad (69)$$



The $u_a^{(d)}$ are i.i.d. random variables with distribution obtained from the
iterative system of DE equations

i~i i-i 



I ^ ) j=i j=i 



k a=l a=l 

with the initial condition $\eta^{(0)}(v) = c(v)$. As before, these equations are an
iterative version of the replica fixed point equation

Proof of corollary 1, LDPC. The first few steps are the same as in the proof
for LDGM. First, we expand the logarithm in (66) and use Nishimori identities
to obtain a series expansion like (26) (the prefactor $\Lambda'(1)/P'(1)$ is now absent).
Second, we notice that since the resulting series expansion is uniformly
absolutely convergent it is enough to show that

$$\lim_{n\to+\infty}\mathbb{E}_{\mathcal{C},l^{n\setminus i}}\bigl[\langle x_i\rangle_0^{2p}\bigr] = \lim_{d\to+\infty}\mathbb{E}_{\Delta^{(d)}}\bigl[(\tanh\Delta^{(d)})^{2p}\bigr] \qquad (70)$$

Thirdly, as before, one argues that this follows from

$$\lim_{d\to+\infty}\lim_{n\to+\infty}\mathbb{E}_{\mathcal{C},l^{n\setminus i}}\bigl[\,\bigl|\langle x_i\rangle_0^{2p} - \langle x_i\rangle_{0,N_d(i)}^{2p}\bigr|\;\big|\;N_d(i)\text{ tree}\bigr] = 0 \qquad (71)$$

and because of $|b^{2p} - a^{2p}| \le 2p\,|b - a|$ (for $|a|, |b| \le 1$) it is enough to show this for $2p$ replaced
by 1. Unfortunately one cannot proceed as simply as in the LDGM case:
(71) is a consequence of the next two auxiliary lemmas stated below. □

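Let $\langle - \rangle^{\infty}_{N_d(i)}$ be the bracket defined on the subgraph $N_d(i)$ with $l_k = +\infty$
for $k \in \partial N_d(i)$. This is in fact formally equivalent to fixing $x_k = +1$ boundary
conditions on the leaves $k \in \partial N_d(i)$ of the tree. The first lemma says that the
bit estimate can be computed locally.

Lemma 2. Under the same conditions as in corollary 1,

$$\lim_{d\to+\infty}\lim_{n\to+\infty}\mathbb{E}_{\mathcal{C},l^{n\setminus i}}\bigl[\,\bigl|\langle x_i\rangle_0 - \langle x_i\rangle^{\infty}_{0,N_d(i)}\bigr|\;\big|\;N_d(i)\text{ tree}\bigr] = 0 \qquad (72)$$

The second lemma says that at low enough noise free and $+1$ boundary
conditions are equivalent.

Lemma 3. Under the same conditions as in corollary 1,

$$\lim_{d\to+\infty}\lim_{n\to+\infty}\mathbb{E}_{\mathcal{C},l^{n\setminus i}}\bigl[\,\bigl|\langle x_i\rangle^{\infty}_{0,N_d(i)} - \langle x_i\rangle_{0,N_d(i)}\bigr|\;\big|\;N_d(i)\text{ tree}\bigr] = 0 \qquad (73)$$

We prove the first lemma. It will then be clear that the proof of the second
one is essentially the same except that the original full graph is replaced by
$N_d(i)$, and thus it is omitted.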

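Proof of lemma 2. In (72) (and (73)) the root node $i$ has $l_i = 0$, which turns
out to be technically cumbersome because we really work in a low noise
regime. For this reason we use

$$\langle x_i\rangle = \frac{\langle x_i\rangle_0 + \tanh l_i}{1+\langle x_i\rangle_0\tanh l_i},\qquad \langle x_i\rangle^{\infty}_{N_d(i)} = \frac{\langle x_i\rangle^{\infty}_{0,N_d(i)} + \tanh l_i}{1+\langle x_i\rangle^{\infty}_{0,N_d(i)}\tanh l_i} \qquad (74)$$

to deduce

$$\langle x_i\rangle_0 - \langle x_i\rangle^{\infty}_{0,N_d(i)} = \frac{\bigl(1-(\tanh l_i)^2\bigr)\bigl(\langle x_i\rangle - \langle x_i\rangle^{\infty}_{N_d(i)}\bigr)}{\bigl(1-\langle x_i\rangle\tanh l_i\bigr)\bigl(1-\langle x_i\rangle^{\infty}_{N_d(i)}\tanh l_i\bigr)} \qquad (75)$$

This implies

$$\bigl|\langle x_i\rangle_0 - \langle x_i\rangle^{\infty}_{0,N_d(i)}\bigr| \le \frac{2}{1-|\tanh l_i|}\,\bigl|\langle x_i\rangle - \langle x_i\rangle^{\infty}_{N_d(i)}\bigr| \qquad (76)$$

and averaging over the noise and using Cauchy-Schwarz,

$$\mathbb{E}_{l^n}\bigl[\bigl|\langle x_i\rangle_0 - \langle x_i\rangle^{\infty}_{0,N_d(i)}\bigr|\bigr] \le 2\,\mathbb{E}\bigl[e^{4|l|}\bigr]^{1/2}\,\mathbb{E}_{l^n}\bigl[\bigl|\langle x_i\rangle - \langle x_i\rangle^{\infty}_{N_d(i)}\bigr|^2\bigr]^{1/2} \le 2\sqrt{2}\,\mathbb{E}\bigl[e^{4|l|}\bigr]^{1/2}\,\mathbb{E}_{l^n}\bigl[\bigl|\langle x_i\rangle - \langle x_i\rangle^{\infty}_{N_d(i)}\bigr|\bigr]^{1/2} \qquad (77)$$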
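The algebra leading from (74) to (76) only involves two numbers in $(-1,1)$ and one half-loglikelihood, so it can be checked on random inputs. The sketch below is an illustration with hypothetical names: `a` and `b` stand for the two full averages $\langle x_i\rangle$ and $\langle x_i\rangle^{\infty}_{N_d(i)}$, and `extrinsic` inverts (74) to recover the corresponding estimates at $l_i = 0$.

```python
import math
import random


def extrinsic(a, l):
    """Invert a = (a0 + tanh l) / (1 + a0 tanh l) to recover the extrinsic estimate a0."""
    t = math.tanh(l)
    return (a - t) / (1.0 - a * t)


def check_75_76(trials=1000, seed=0):
    """Return the worst relative error of identity (75) and whether bound (76) always held."""
    rng = random.Random(seed)
    worst_identity = 0.0
    bound_ok = True
    for _ in range(trials):
        a = rng.uniform(-0.999, 0.999)
        b = rng.uniform(-0.999, 0.999)
        l = rng.uniform(-2.0, 2.0)
        t = math.tanh(l)
        lhs = extrinsic(a, l) - extrinsic(b, l)
        rhs = (1 - t * t) * (a - b) / ((1 - a * t) * (1 - b * t))
        worst_identity = max(worst_identity, abs(lhs - rhs) / (1 + abs(rhs)))
        bound = 2 * abs(a - b) / (1 - abs(t))
        # (76), followed by 1 / (1 - |tanh l|) <= exp(2|l|), which is used in (77).
        bound_ok = bound_ok and abs(lhs) <= bound * (1 + 1e-9)
        bound_ok = bound_ok and bound <= 2 * math.exp(2 * abs(l)) * abs(a - b) * (1 + 1e-9)
    return worst_identity, bound_ok


worst_identity, bound_ok = check_75_76()
```

The second inequality checked in the loop is the step that turns the prefactor of (76) into the moment $\mathbb{E}[e^{4|l|}]$ after Cauchy-Schwarz.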

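Let us now prove

$$\lim_{d\to+\infty}\lim_{n\to+\infty}\mathbb{E}_{\mathcal{C},l^{n}}\bigl[\,\bigl|\langle x_i\rangle - \langle x_i\rangle^{\infty}_{N_d(i)}\bigr|\;\big|\;N_d(i)\text{ tree}\bigr] = 0 \qquad (78)$$

We order the variable nodes at the boundary $\partial N_d(i)$ and consider the
corresponding vector of loglikelihoods with components $l_k$, $k \in \partial N_d(i)$. If the first
$k-1$ components of this vector are $l_1 = \dots = l_{k-1} = +\infty$, the $k$-th component
is $l'_k$ and the other ones are i.i.d. distributed as $c(l)$ (in other words they are
"natural") we write $\langle - \rangle^{\infty}_{\le k-1}$. From the fundamental theorem of calculus, it
is not difficult to see that

$$\langle x_i\rangle - \langle x_i\rangle^{\infty}_{N_d(i)} = -\sum_{k\in\partial N_d(i)}\int_{l_k}^{+\infty} dl'_k\,\frac{d}{dl'_k}\langle x_i\rangle^{\infty}_{\le k-1} = -\sum_{k\in\partial N_d(i)}\int_{l_k}^{+\infty} dl'_k\,\bigl(\langle x_ix_k\rangle^{\infty}_{\le k-1} - \langle x_i\rangle^{\infty}_{\le k-1}\langle x_k\rangle^{\infty}_{\le k-1}\bigr) \qquad (79)$$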
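Using $|a| \le |a|^{2s}$ for any $0 < 2s < 1$ and $|a| \le 1$ we get

$$\bigl|\langle x_i\rangle - \langle x_i\rangle^{\infty}_{N_d(i)}\bigr| \le 2^{1-2s}\sum_{k\in\partial N_d(i)}\int_{l_k}^{+\infty} dl'_k\,\bigl|\langle x_ix_k\rangle^{\infty}_{\le k-1} - \langle x_i\rangle^{\infty}_{\le k-1}\langle x_k\rangle^{\infty}_{\le k-1}\bigr|^{2s} \qquad (80)$$

Let $\langle - \rangle^{\infty,\perp}_{\le k-1}$ be the dual bracket (with the first $k-1$ components of $\partial N_d(i)$ equal to
$l_1 = \dots = l_{k-1} = +\infty$ and the $k$-th component equal to $l'_k$). Because of (49) we have

$$\bigl|\langle x_i\rangle - \langle x_i\rangle^{\infty}_{N_d(i)}\bigr| \le 2^{1-2s}\sum_{k\in\partial N_d(i)}\int_{l_k}^{+\infty} dl'_k\,\frac{\bigl|\langle\tau_i\tau_k\rangle^{\infty,\perp}_{\le k-1} - \langle\tau_i\rangle^{\infty,\perp}_{\le k-1}\langle\tau_k\rangle^{\infty,\perp}_{\le k-1}\bigr|^{2s}}{(\sinh 2l_i\,\sinh 2l'_k)^{2s}} \qquad (81)$$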



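Note that the denominator in the integral is important to make the integral
convergent for $l'_k \to +\infty$. Moreover the pole at $l_i = 0$ and $l'_k = 0$ is harmless as long as
$2s < 1$. The next step is to use the cluster expansion in order to estimate

$$\mathbb{E}\Bigl[\frac{\bigl|\langle\tau_i\tau_k\rangle^{\infty,\perp}_{\le k-1} - \langle\tau_i\rangle^{\infty,\perp}_{\le k-1}\langle\tau_k\rangle^{\infty,\perp}_{\le k-1}\bigr|^{2s}}{(\sinh 2l_i\,\sinh 2l'_k)^{2s}}\Bigr] \qquad (82)$$

By following steps similar to those in the proof of theorem 1 one obtains an
upper bound similar to (64) except that the likelihoods of the end points are
weighted differently and therefore there are two factors of $A(\epsilon)$ (see (63))
replaced by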

E " . ^„ — <oo and E / dl' 



(sinh 2/)- 



2s 



+ e 



~8sl' 



(sinh 21') 



-2s 



< 00 



(83) 

Finally we can average over the code ensemble conditional on the event that 
Nd{i) is a tree. Since the clusters X that connect di and dk, k G Nd{i) have 
size |X|A;max we obtain the result as long as A(e) is small enough, for e small 
enough. □ 



5 Large block length versus large number of 
iterations 

In the LDGM case we prove the exchange of limits $d, n \to +\infty$ for the
BSC. As will become clear one needs the decay of correlations (or
covariance) of the Gibbs measure on the computational tree for $d \gg n$.
Hence the likelihoods are not independent random variables; nevertheless the proof of
theorem 1 still goes through in the case of the BSC. The only difference is that in lemma 1
we can take $H > \frac{1}{2}\bigl|\ln\frac{1-\epsilon}{\epsilon}\bigr|$ such that $\mathcal{B} = \emptyset$ and

$$\rho_{\pi(j)} = e^{4|l_{\pi(j)}|} - 1 = \frac{4|1-2\epsilon|}{(1-|1-2\epsilon|)^2}$$

for all $j \in T_d(i)$.
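The closed forms used for the BSC can be cross-checked against their definitions. This is a minimal sketch assuming the half-loglikelihood magnitude $|l| = \frac{1}{2}|\ln\frac{1-\epsilon}{\epsilon}|$ for the BSC($\epsilon$); the function name is hypothetical. It compares $\rho = e^{4|l|}-1$ and the coefficient $t = |\tanh l|/(1-|\tanh l|)$ of (35) with their expressions in terms of $|1-2\epsilon|$.

```python
import math


def bsc_constants(eps):
    """Return (rho, rho_closed, t, t_closed) for a BSC with crossover probability eps."""
    l_abs = abs(0.5 * math.log((1 - eps) / eps))
    delta = abs(1 - 2 * eps)
    rho_direct = math.exp(4 * l_abs) - 1
    rho_closed = 4 * delta / (1 - delta) ** 2
    th = math.tanh(l_abs)            # equals |1 - 2 eps| for the BSC
    t_direct = th / (1 - th)
    t_closed = delta / (1 - delta)
    return rho_direct, rho_closed, t_direct, t_closed


results = {eps: bsc_constants(eps) for eps in (0.05, 0.2, 0.45, 0.5, 0.7)}
```

Both quantities vanish at $\epsilon = \frac{1}{2}$ and are of order $|1-2\epsilon|$ nearby, which is why the high noise condition can be phrased in terms of $|1-2\epsilon|$ alone.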
Lemma 4 (Decay of correlations for the BP decoder, LDGM on BSC).

Consider communication with a fixed general LDGM code with blocklength
$n$ and bounded degrees $l_{\max}$, $k_{\max}$, over the BSC($\epsilon$). We can find $c > 0$,
a small enough numerical constant, such that for $l_{\max}k_{\max}|1-2\epsilon| < c$ we have,
for any given realization of the channel outputs,

$$\bigl|\langle x_ix_j\rangle^{BP}_d - \langle x_i\rangle^{BP}_d\langle x_j\rangle^{BP}_d\bigr| \le c_1\,e^{-c_2(\epsilon)\,\mathrm{dist}(i,j)} \qquad (84)$$

where $i$ is the root of the computational tree, $j$ an arbitrary node, $c_1 > 0$ a
numerical constant and $c_2(\epsilon) > 0$ depending only on $\epsilon$, $l_{\max}$, $k_{\max}$. Moreover
$c_2(\epsilon)$ increases like $|\ln|1-2\epsilon||$ as $\epsilon \to \frac{1}{2}$.

Basically, this result is contained in [22] where it is obtained by
Dobrushin's criterion. Note that it is valid for fixed noise realizations and not
only on average. The unbounded case would require taking averages, but
then, on the computational tree, one has to control moments of $\rho_{\pi(i)}$, and
this requires more work. The following proof is a simple application of this
lemma.

Proof of theorem 2, LDGM, BSC. We take for the number of iterations of
the BP decoder $d \gg n$. On the computational tree $T_d(i)$ we consider the
subtree of root $i$ and depth $d' \ll n$. This subtree is a smaller computational
tree $T_{d'}(i) \subset T_d(i)$ and $d' \ll n \ll d$. Let $\partial T_{d'}(i)$ be the leaves $k$ with $\mathrm{dist}(i,k) = d'$
and order them in an arbitrary way. Consider the Gibbs measure $\langle - \rangle_{d;\le k}$
where for the first $k$ checks of $\partial T_{d'}(i)$ we set $l_{\pi(k)} = 0$ in (10). Proceeding as
in section 3 we get

$$\bigl|\langle x_i\rangle^{BP}_d - \langle x_i\rangle^{BP}_{d'}\bigr| \le \sum_{k\in\partial T_{d'}(i)} t_{\pi(k)}\,\bigl|\langle x_ix_k\rangle_{d;\le k} - \langle x_i\rangle_{d;\le k}\langle x_k\rangle_{d;\le k}\bigr| \qquad (85)$$

For the BSC, $t_{\pi(k)} = \frac{|1-2\epsilon|}{1-|1-2\epsilon|}$. From lemma 4, for $|1-2\epsilon|$ small enough (but
independent of $n$, $d$)

$$\bigl|\langle x_i\rangle^{BP}_d - \langle x_i\rangle^{BP}_{d'}\bigr| = O\bigl(K^{d'}e^{-c_2(\epsilon)d'}\bigr) \qquad (86)$$

In this equation $O(-)$ is uniformly bounded with respect to $n$ and $d$ (and the
noise realizations of course). Recall the GEXIT function of the BP decoder

A'(l) f,,dc{k) f l + {xi)^^^tanhk \ 

9nAe) = ^ y dk^E,,.,. ln| ^^^^^^^^ | (87) 
Since for $|1-2\epsilon| \ll 1$, $|\tanh l_j| \le |l_j| = \frac{1}{2}\bigl|\ln\frac{1-\epsilon}{\epsilon}\bigr| \ll 1$, one can easily show

$$g_{n,d}(\epsilon) = g_{n,d'}(\epsilon) + O\bigl(K^{d'}e^{-c_2(\epsilon)d'}\bigr) \qquad (88)$$

For example one could proceed by expanding the $\ln$ in powers of $|\tanh l_i|$
and estimating the series term by term. Now since $O(-)$ is uniformly bounded
with respect to $n$, $d$, (88) implies for $d'$ fixed

$$\lim_{n\to+\infty}\liminf_{d\to\infty} g_{n,d}(\epsilon) = \lim_{n\to+\infty} g_{n,d'}(\epsilon) + O\bigl(K^{d'}e^{-c_2(\epsilon)d'}\bigr) \qquad (89)$$

Now we take the limit $d' \to +\infty$,

$$\lim_{n\to+\infty}\liminf_{d\to\infty} g_{n,d}(\epsilon) = \lim_{d'\to+\infty}\lim_{n\to+\infty} g_{n,d'}(\epsilon) \qquad (90)$$

A similar result with lim sup replacing lim inf is derived in the same way. □

6 Conclusion 

In this paper we have shown that cluster expansion techniques of statistical 
mechanics are a valuable tool for the theory of error correcting codes on 
graphs. We have not investigated the regimes of high noise for LDPC codes 
and low noise for LDGM codes. In the case of LDPC codes and high noise we 
are able to prove decay of correlations for ensembles that contain a sufficient 
fraction of degree one variable nodes. Indeed one can eliminate the degree 
one nodes and convert the problem to a new graphical model containing a 
mixture of hard parity check constraints and soft LDGM type weights. If
the density of soft weights is high enough the analysis of the present paper
can be extended (see [22] for a summary). Combining these ideas with
duality one may also treat special ensembles of LDGM codes for low noise.
This approach however is not entirely satisfactory and it is not clear how to 
directly go about with cluster expansions in these regimes. 

We hope that the ideas and techniques investigated in the present work 
could have other applications in coding theory and more broadly random 
graphical models. Let us mention that various forms of correlation decay have
been investigated recently for the random K-SAT problem at low constraint
density, by different methods [27]. This has allowed the authors to prove
that the replica symmetric solution is exact at low constraint density. A new
type of expansion called "loop expansion" has also been derived in
an attempt to compute corrections to BP equations. The link to traditional
cluster expansions is unclear to us, and it would be interesting to develop
rigorous methods to control the loop expansions. Finally, we would also like
to point out the work [29] where a new derivation of the Gilbert-Varshamov
bound is presented using the Mayer expansion for hard-sphere systems.

A Cluster Expansions

In this appendix we explain the derivation of the two cluster expansions
that we use. In the statistical mechanics literature these have been derived
for spin systems with pair interactions on regular graphs. It turns out that
they can be adapted to our setting. We try to give a self-contained but still
reasonably short derivation here.

A.1 Cluster expansion for LDGM codes

Here we adapt the cluster expansion of Dreifus-Klein-Perez in [20]. In the
process we prove lemma 1 stated in section 3. It will be very convenient to
use the following compact notation

$$\prod_{a\in X} u_a = u_X,\quad\text{for any set } X\subset\{1,\dots,m\} \qquad (91)$$

In particular the code-bits $x_i = \prod_{a\in\partial i}u_a$ become $u_{\partial i}$, $i = 1,\dots,n$ and the
correlation of lemma 1 becomes

$$\langle u_Au_B\rangle - \langle u_A\rangle\langle u_B\rangle \qquad (92)$$

It is first necessary to rewrite the Gibbs measure (6) in a form such that the
exponent is positive

$$\frac{1}{Z}\prod_{i=1}^{n}e^{l_ix_i} = \frac{1}{Z'}\prod_{i=1}^{n}e^{l_ix_i+|l_i|} \qquad (93)$$

where $Z'$ is the appropriately modified partition function. We introduce the
replicated measure, which is the product of two copies,

$$\frac{1}{Z'^2}\prod_{i=1}^{n}e^{l_i\bigl(u^{(1)}_{\partial i}+u^{(2)}_{\partial i}\bigr)+2|l_i|} \qquad (94)$$

Thus we now have two replicas of the information-bits, $u^{(1)}$ and $u^{(2)}$.
The Gibbs bracket for the replicated measure is denoted by $\langle - \rangle_{12}$. It is easy
to see that

$$\langle u_Au_B\rangle - \langle u_A\rangle\langle u_B\rangle = \frac{1}{2}\bigl\langle\bigl(u_A^{(1)}-u_A^{(2)}\bigr)\bigl(u_B^{(1)}-u_B^{(2)}\bigr)\bigr\rangle_{12} \qquad (95)$$
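The replica identity (95) can be confirmed by enumerating both the single and the doubled system on a small example. The sketch below is illustrative only: the toy LDGM neighborhoods `NBRS`, the half-loglikelihoods `L`, the sets `A`, `B` and the function names are assumptions. The constant factors $e^{|l_i|}$ of (93) cancel in normalized averages and are therefore left out of `weight`. The code also records the smallest value taken by $e^{l_i(u^{(1)}_{\partial i}+u^{(2)}_{\partial i})+2|l_i|}-1$, the quantity called $K_i$ in the text, to illustrate that it is never negative.

```python
import itertools
import math

# Hypothetical toy LDGM code: code bit i is the product of the information bits in NBRS[i].
NBRS = [(0, 1), (1, 2), (0, 2, 3), (3,), (1, 3)]
M = 4
L = [0.6, -0.9, 0.4, 1.2, -0.3]  # arbitrary half-loglikelihoods
CONFIGS = list(itertools.product((1, -1), repeat=M))


def u_set(u, subset):
    """Compact notation u_X = product of u_a over a in X."""
    return math.prod(u[a] for a in subset)


def weight(u):
    """Unnormalized Gibbs weight prod_i exp(l_i u_{di})."""
    return math.exp(sum(l * u_set(u, nb) for l, nb in zip(L, NBRS)))


def average(f):
    z = sum(weight(u) for u in CONFIGS)
    return sum(weight(u) * f(u) for u in CONFIGS) / z


def covariance(A, B):
    """Left hand side of (95), computed in the single system."""
    return average(lambda u: u_set(u, A) * u_set(u, B)) - average(
        lambda u: u_set(u, A)
    ) * average(lambda u: u_set(u, B))


def replica_side(A, B):
    """Right hand side of (95) in the doubled system, plus the minimum of K_i."""
    z2, num, min_k = 0.0, 0.0, math.inf
    for u1, u2 in itertools.product(CONFIGS, repeat=2):
        w = weight(u1) * weight(u2)
        z2 += w
        num += w * (u_set(u1, A) - u_set(u2, A)) * (u_set(u1, B) - u_set(u2, B))
        for l, nb in zip(L, NBRS):
            k_i = math.exp(l * (u_set(u1, nb) + u_set(u2, nb)) + 2 * abs(l)) - 1
            min_k = min(min_k, k_i)
    return 0.5 * num / z2, min_k


A, B = (0, 1), (3,)
lhs = covariance(A, B)
rhs, min_k = replica_side(A, B)
```

The antisymmetry of $u_X^{(1)}-u_X^{(2)}$ under the exchange of replicas, visible in `replica_side`, is what later makes the disconnected terms of the expansion vanish.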
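Recall that $\mathcal{B} = \{i \mid |l_i| > H\}$ for some fixed number $H$, and set

$$e^{l_i\bigl(u^{(1)}_{\partial i}+u^{(2)}_{\partial i}\bigr)+2|l_i|} - 1 \equiv K_i \qquad (96)$$

It will be important to keep in mind later that $K_i \ge 0$. We have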
i((n«-nj))(n«-ng))),, 

GCB<:„(i),u(2) ieB iGG 

where $f_X = u_X^{(1)} - u_X^{(2)}$, $X = A, B$.

Take a term with given $G \subset \mathcal{B}^c$ in the last sum. We say that "$G$ connects
$A$ and $B$" if and only if there exists a self-avoiding walk^1 $w_{ab}$ with initial
variable node $a \in A$, final variable node $b \in B$ and such that all check nodes
of $w_{ab}$ are in $G\cup\mathcal{B}$.^2 The crucial point is that if a set $G$ does not connect
$A$ and $B$, then it gives a vanishing contribution to the sum. We defer the
proof of this fact to the end of this section. For the moment let us show that
it implies the bound in lemma 1. The positivity of $K_i$ implies

^1 See section 3 for the definition of these walks.

^2 Note that it is really $G\cup\mathcal{B}$ that connects $A$ and $B$. Since $\mathcal{B}$ is fixed our definition is
valid.

\{uaUb) - {ua){ub)\ 

GCB" -u(i),«(2) iGB iGG 

G connects A and B 

E E E n-'^"^^'""''^"'" U^U^^ (98) 

In the second inequality we used $K_i \le e^{4|l_i|} - 1 = \rho_i$. Now resumming over
$G' \subset \mathcal{B}^c\setminus w$ we obtain

\{uaUb) - {ua){ub)\ 

^ E n E n^'^""'^^'''^^'"'' n 

w&Wab iew\B ieB iel3''\w 

=2 E n ^- 

w&Vab iew\B 

(99) 

The inequality follows by inserting extra terms $1+K_i \ge 1$ for $i \in w\setminus\mathcal{B}$,
and the equality by reconstituting $Z'^2$ in the numerator. Now, the last line is
equal to

$$2\sum_{w\in W_{AB}}\prod_{i\in w}\rho_i,\qquad \rho_i = 1,\ i\in\mathcal{B}\quad\text{and}\quad\rho_i = e^{4|l_i|}-1,\ i\notin\mathcal{B} \qquad (100)$$

Hence the bound (16).

It remains to explain why, if $G$ does not connect $A$ and $B$, the $G$-term does
not contribute to (97). Let $\partial G\cup\partial\mathcal{B}$ be the set of variable nodes connected
to the check nodes $G\cup\mathcal{B}$. We define a partition $\partial G\cup\partial\mathcal{B} = V_A\cup V_B\cup V_C$ into
three sets of variable nodes. $V_A$ is the set of all variable nodes $v$ such that
there exists a self-avoiding walk $w_{av}$ connecting some $a \in A$ to $v$, and such
that all check nodes of $w_{av}$ are in $G\cup\mathcal{B}$. $V_B$ is similarly defined with $B$ and
$b \in B$ instead of $A$. Finally $V_C = (\partial G\cup\partial\mathcal{B})\setminus(V_A\cup V_B)$. By construction
$V_C\cap V_A = V_C\cap V_B = \emptyset$. The point is that if $G$ does not connect $A$ and $B$,
then $V_A\cap V_B = \emptyset$. Indeed, otherwise there would be a $u \in V_A\cap V_B$ with a
walk $w_{au}$ and a walk $w_{ub}$ both with all check nodes in $G\cup\mathcal{B}$, but this would
mean that $G$ connects $A$ and $B$ through the walk $w_{au}\cup w_{ub}$. We also define
three sets of check nodes $G_A = (G\cup\mathcal{B})\cap\partial V_A$, $G_B = (G\cup\mathcal{B})\cap\partial V_B$ and
$G_C = (G\cup\mathcal{B})\setminus(G_A\cup G_B)$. Again the three sets are disjoint when $G$ does not
connect $A$ and $B$: indeed if there exists $c \in G_A\cap G_B$ then $c$ belongs to both
$\partial V_A$ and $\partial V_B$, which we just argued is impossible. This situation is depicted in
figure 3.

Figure 3: On the LDGM graph, $\mathcal{B}$ is depicted by the dark squares. The set
$G \subset \mathcal{B}^c$ is depicted by the light squares. $A$ and $B$ both contain three nodes;
there does not exist a self-avoiding walk that connects these two sets with all
its check nodes in $G$. The sets of variable nodes $V_A$, $V_B$ and $V_C$ are disjoint
as well as the sets of check nodes $G_A$, $G_B$ and $G_C$: these sets are enclosed in
the dotted areas.
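The notion "$G$ connects $A$ and $B$" is a reachability statement on the bipartite graph, so it can be tested with a breadth-first search restricted to the allowed check nodes. The sketch below is an illustration with hypothetical names and a hypothetical four-check graph; it uses the fact that a self-avoiding walk exists whenever any walk exists. Note that the returned set also contains the starting nodes themselves, which is slightly larger than the set $V_A$ of the text when a starting node touches no allowed check.

```python
from collections import deque


def reachable_variables(check_nbrs, allowed_checks, sources):
    """Variable nodes reachable from `sources` by walks whose checks all lie in `allowed_checks`."""
    var_to_checks = {}
    for c in allowed_checks:
        for v in check_nbrs[c]:
            var_to_checks.setdefault(v, []).append(c)
    seen = set(sources)
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for c in var_to_checks.get(v, ()):
            for w in check_nbrs[c]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
    return seen


def connects(check_nbrs, allowed_checks, set_a, set_b):
    """True if some walk from set_a to set_b uses only checks in allowed_checks."""
    return bool(reachable_variables(check_nbrs, allowed_checks, set_a) & set(set_b))


# Hypothetical graph: check node -> variable nodes it contains.
CHECKS = {0: (0, 1), 1: (1, 2), 2: (2, 3), 3: (4, 5)}
connected_case = connects(CHECKS, {0, 1, 2}, {0}, {3})
disconnected_case = connects(CHECKS, {0, 2}, {0}, {3})
```

In the setting of the text `allowed_checks` would be $G\cup\mathcal{B}$; when `connects` returns `False` the reachable sets from $A$ and from $B$ are disjoint, which is the factorization property exploited next.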

Now we examine a term of (97) for a $G$ that does not connect $A$ and $B$.
Expanding the product $f_Af_B$, using linearity of the bracket and symmetry
under exchange of replicas $(1) \leftrightarrow (2)$, it is equal to the difference $I - II$
where

«{i),-u(2) i&B i&G 

Because of the disjointness of the sets $V_{A,B,C}$ and $G_{A,B,C}$ (the areas enclosed
in dotted lines, see figure 3) one can, in $I$ and $II$, factor the sums $\sum_{u^{(1)},u^{(2)}}$
in a product of three terms (in fact there is a fourth trivial term which is a
power of 2 coming from the bits outside the dotted areas). Then by symmetry
$(1) \leftrightarrow (2)$ one recognizes that $I = II$. Thus $I - II = 0$ and this proves that $G$ does
not contribute to (97) when it does not connect $A$ and $B$.



A.2 Cluster expansion for LDPC codes

Here we adapt the Berretti cluster expansion to our setting. For more details
we refer to [21], [19]. Consider the replicated partition function

$$Z_\perp^2 = \sum_{u^{(1)},u^{(2)}\in\{-1,+1\}^m}\ \prod_{k=1}^{n}\bigl(1+\tau_k^{(1)}e^{-2l_k}\bigr)\bigl(1+\tau_k^{(2)}e^{-2l_k}\bigr) \qquad (103)$$

here $u^{(1)} = (u_1^{(1)},\dots,u_m^{(1)})$, $u^{(2)} = (u_1^{(2)},\dots,u_m^{(2)})$ are two replicas of the
information bits and $\tau_k^{(1)} = \prod_{a\in\partial k}u_a^{(1)}$, $\tau_k^{(2)} = \prod_{a\in\partial k}u_a^{(2)}$. We have

$$\langle\tau_i\tau_j\rangle_\perp - \langle\tau_i\rangle_\perp\langle\tau_j\rangle_\perp = \frac{1}{2}\bigl\langle\bigl(\tau_i^{(1)}-\tau_i^{(2)}\bigr)\bigl(\tau_j^{(1)}-\tau_j^{(2)}\bigr)\bigr\rangle_{\perp,12} \qquad (104)$$

where $\langle\cdot\rangle_{\perp,12}$ corresponds to the replicated system. We denote $f_i = \tau_i^{(1)}-\tau_i^{(2)}$,
$f_j = \tau_j^{(1)}-\tau_j^{(2)}$. Then we have

$$\langle f_if_j\rangle_{\perp,12} = \frac{1}{Z_\perp^2}\sum_{u^{(1)},u^{(2)}} f_if_j\prod_{k}(1+E_k) \qquad (105)$$

where $E_k$ is defined in (55). Expanding the product we get,

$$\langle f_if_j\rangle_{\perp,12} = \frac{1}{Z_\perp^2}\sum_{u^{(1)},u^{(2)}} f_if_j\sum_{V\subset\mathcal{V}}\prod_{k\in V}E_k = \frac{1}{Z_\perp^2}\sum_{V\subset\mathcal{V}}\ \sum_{u^{(1)},u^{(2)}} f_if_j\prod_{k\in V}E_k \qquad (106)$$

where $\mathcal{V}$ denotes the set of all variable nodes of the original Tanner graph
for the LDPC code and $V$ is any subset of distinct variable nodes. Suppose
$V \subset \mathcal{V}$ is such that one cannot create a walk (i.e. on the original Tanner
graph of the LDPC code, a set of alternating variable and check nodes)
connecting any check node in $\partial i$ to any check node in $\partial j$, and which has
all its variable nodes contained entirely in $V$. Then we can partition $V$ into
three mutually disjoint sets of variable nodes, $V_1, V_2, V_3$ such that $V_1 \ni i$,
$V_2 \ni j$ and $V_3 = V\setminus(V_1\cup V_2)$. Note also that $\partial V_1, \partial V_2, \partial V_3$ are mutually
disjoint, otherwise we could create a walk between $\partial i$ and $\partial j$. Thus we can
write

X E n^'^ (107) 

This implies that (107) vanishes. This is seen by using the antisymmetry
of $f_i$ (or $f_j$) and the symmetry of $E_k$ under the exchange $(1) \leftrightarrow (2)$. Thus
only those $V$ which contain a walk with all its variable nodes in $V$ and which
intersects both $\partial i$ and $\partial j$ contribute to the sum in (106).

For any given $V$ (contributing to the sum) we construct the set of variable
nodes $\Gamma_V$ as follows. $\Gamma_V$ is the union of all maximal connected clusters
of distinct variable nodes in $V$, such that each of those connected clusters
intersects $\partial i\cup\partial j$. Let $\Gamma_V^c = V\setminus\Gamma_V$. Clearly, there exists such a set because
we know that the walk which connects $\partial i$ and $\partial j$ is a subset of $\Gamma_V$. Let
$X_V = \partial\Gamma_V\cup\partial i\cup\partial j$ be a set of check nodes. It is not difficult to see that $X_V$
satisfies all the requirements of the set $X$ in the sum (53). Indeed, consider
$\tilde X_V = \Gamma_V\cup i\cup j$. By construction $\partial\tilde X_V = X_V$; any two variable nodes in $\tilde X_V$
are connected by a walk with all its variable nodes in $\tilde X_V$; $X_V$ contains both
$\partial i$ and $\partial j$. Also note that $\Gamma_V$ is compatible with $X_V$ as is required in the
sum (54). Indeed, by construction $\partial\Gamma_V\cup\partial i\cup\partial j = X_V$; $\partial\Gamma_V\cap\partial i \neq \emptyset$ and
$\partial\Gamma_V\cap\partial j \neq \emptyset$; there exists a walk between $\partial i$ and $\partial j$ with all its variable
nodes in $\Gamma_V$.
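The construction of $\Gamma_V$ and $X_V$ is a connected-component computation, which the following sketch spells out. It rests on an interpretation that should be read as an assumption: two variable nodes are treated as adjacent when they share a check node, and a cluster "intersects $\partial i\cup\partial j$" when one of its variable nodes has a check in $\partial i\cup\partial j$. The graph, node labels and function name are hypothetical.

```python
def gamma_and_cluster(var_nbrs, V, i, j):
    """Return (Gamma_V, X_V) for a set V of variable nodes.

    var_nbrs maps each variable node to the set of its check nodes.
    """
    target = var_nbrs[i] | var_nbrs[j]          # the checks in di and dj
    remaining = set(V)
    gamma = set()
    while remaining:
        start = remaining.pop()
        component, stack = {start}, [start]
        while stack:                              # depth-first search inside V
            v = stack.pop()
            for w in list(remaining):
                if var_nbrs[v] & var_nbrs[w]:     # v and w share a check node
                    remaining.remove(w)
                    component.add(w)
                    stack.append(w)
        if any(var_nbrs[v] & target for v in component):
            gamma |= component                    # keep clusters touching di or dj
    x_v = set().union(*(var_nbrs[v] for v in gamma)) | target
    return gamma, x_v


# Hypothetical Tanner graph: variable node -> set of check nodes.
VAR_NBRS = {
    "i": {"a1"}, "j": {"a4"},
    "v1": {"a1", "a2"}, "v2": {"a2", "a3"}, "v3": {"a3", "a4"},
    "v4": {"a5", "a6"}, "v5": {"a6"},
}
gamma_v, x_v = gamma_and_cluster(VAR_NBRS, {"v1", "v2", "v3", "v4", "v5"}, "i", "j")
```

In this example the cluster formed by `v4` and `v5` touches neither $\partial i$ nor $\partial j$ and therefore ends up in the complement $V\setminus\Gamma_V$.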
With this we can write



-L yc23 , (1) , (2) fcerv „(i) ,,(2) fcer?, 

aEOFvU^iU^j remaining a 

^l'"'^ aGX remaining a 

Now we resum over the sets $V$ such that $X_V = X$. These consist of $\Gamma$
compatible with $X$ and the rest $\mathcal{G}$ which does not intersect $X$. So

^ i- rcompatibio ker J I , (i) „(2) gcvo k<^g 

=^e( E E /./.n4| E n d+^j} 

^ ;f L „{2) rcompatible fcGF j I „(2) all fc s.t. ) 



(109) 

The last bracket is equal to (56) and we recognize Berretti's expansion. Figure 4
shows a sample set $V$ and $\Gamma_V$ which give a non-vanishing contribution.

Figure 4: The dark variable nodes form the set $V = \{v_1, \dots, v_{16}\}$. The
walk $a_1v_{12}a_2v_2a_3v_3a_4v_6a_5v_8a_6v_{11}a_7$ connects $\partial i$ to $\partial j$ and hence this $V$ has a
non-vanishing contribution. $\Gamma_V = \{v_1, \dots, v_{13}\}$ is the union of the two maximal
connected clusters $\{v_1, v_{13}\}$ and $\{v_{12}, v_2, v_3, v_4, v_5, v_6, v_7, v_8, v_9, v_{10}, v_{11}\}$ which
have an intersection with $\partial i\cup\partial j$. $\Gamma_V^c = \{v_{14}, v_{15}, v_{16}\}$.